More on Big Numbers in C++

I’ve been rather lackadaisical about fixing my “big number” classes, but I’ve finally gotten up off the couch, and I have new versions available.

I’ve changed the names of the bigint and bigdec classes to integer and decimal, respectively, because I thought the “big…” names smelled of Java.  I also wrote a small Web page that ties the three classes together.

In a comment to a previous post, Andrew Dalke suggested some additional values I could test; and, sure enough, I found a bug (thanks).  (The bug was actually in what’s now called the integer class:  I hadn’t guarded against aliasing of the operands to expressions like x *= x.)
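To illustrate the kind of aliasing problem described above, here is a minimal sketch.  It is not the code from the integer class:  toy_int and its base-10 digit vector are made up for the example.  The point is only that the product is built in a separate buffer, so the right-hand operand is never modified while it is still being read, even when it is the same object as the left-hand operand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy type, NOT the real integer class:
// base-10 digits stored least significant first.
struct toy_int {
    std::vector<std::uint8_t> digits;

    toy_int& operator*=(const toy_int& rhs) {
        assert(!digits.empty() && !rhs.digits.empty());
        // Accumulate into a separate buffer.  Writing partial products
        // straight into this->digits would corrupt rhs when &rhs == this.
        std::vector<std::uint8_t> result(digits.size() + rhs.digits.size(), 0);
        for (std::size_t i = 0; i < digits.size(); ++i) {
            unsigned carry = 0;
            for (std::size_t j = 0; j < rhs.digits.size(); ++j) {
                const unsigned t = result[i + j] + digits[i] * rhs.digits[j] + carry;
                result[i + j] = static_cast<std::uint8_t>(t % 10);
                carry = t / 10;
            }
            result[i + rhs.digits.size()] = static_cast<std::uint8_t>(carry);
        }
        // Drop leading zeros but keep at least one digit.
        while (result.size() > 1 && result.back() == 0) result.pop_back();
        digits = std::move(result);   // only now overwrite the left operand
        return *this;
    }
};
```

With this shape, squaring a toy_int that holds 12 via x *= x leaves the digits of 144 in x.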

I think I remember someone suggesting that some users might prefer classes with “more features”.  What additional features did you have in mind?  Don’t suggest trig. functions and the like:  I’ve limited the <cmath>-like functions that take decimal and rational arguments to those that return exact values.  (You might be able to talk me into square root, but that would be successive approximation using Newton’s method, which is what I know how to write.  I already have a version of the rational class that has quiet NaNs and infinities and a spaceship operator, but I’m not sure I like it.)
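For readers who haven’t seen it, this is roughly what successive approximation by Newton’s method looks like for a square root.  It’s a sketch on a built-in 64-bit integer, not on the integer, decimal or rational classes; the function name isqrt is an assumption for the example.  A decimal square root to a given number of places could, in principle, be built on the same loop by scaling the operand by a power of 100 first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: floor(sqrt(n)) by Newton's method on integers.
std::uint64_t isqrt(std::uint64_t n) {
    if (n < 2) return n;
    // Any starting guess >= sqrt(n) works; n/2 + 1 cannot overflow.
    std::uint64_t x = n / 2 + 1;
    std::uint64_t y = (x + n / x) / 2;   // one Newton step for x*x - n = 0
    while (y < x) {                      // guesses decrease until they settle
        x = y;
        y = (x + n / x) / 2;
    }
    assert(x <= n / x);                  // i.e. x*x <= n, checked without overflow
    return x;
}
```

The loop stops as soon as a step fails to decrease the guess, at which point x is the largest integer whose square does not exceed n.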

As I’ve said before, these classes are not intended for serious numerical work; and numerics experts probably already know where to find better implementations, or could write such themselves.

Update, 2024-02-05:  I woke up this morning having in my mind a way to make rational comparisons a bit quicker, so I made that change.  I also noticed that I had failed to remove an isinf() test of the accuracy argument to the conversion from floating point values.  (In the previous version, passing a NaN for the accuracy would trigger an exact conversion using std::frexp().  In the current version, any non-finite accuracy will do that.)

By the way, if anybody out there has access to a C++ implementation where FLT_RADIX != 2, I’d appreciate a test of the frexp() business.  I have access only to boxes where FLT_RADIX is 2.

More on the rational Class

Andrew Dalke did a few tests of my rational number library (thanks) and I thought it deserved a post of its own.

The first links to code in the function float_as_integer_ratio_impl(), which is the C implementation of the Python method as_integer_ratio() …

OK, I found it.  Yeah, I tried to use frexp(); but I tried to do some bit twiddling to get the mantissa bits as an integer and that was a big mistake.  It’s easy once you know how to do it. 😎

I’ll add something like this to my rational class; but I’m not sure yet whether I want to make it only for FLT_RADIX == 2 and whether I want to allow the user to do something less than an exact conversion, falling back on the continued fractions routine if I need to.
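As a rough illustration of the frexp() approach, here is a sketch for a plain double on a FLT_RADIX == 2 machine.  It is neither the Python implementation nor the code in the rational class; exact_parts and to_exact_parts are names invented for the example.  It returns an integer mantissa and a power-of-two exponent rather than a numerator and denominator, because the denominator can need far more than 64 bits, which is where an unbounded integer would come in.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdint>

static_assert(FLT_RADIX == 2, "this sketch assumes binary floating point");
static_assert(DBL_MANT_DIG <= 63, "mantissa must fit in a 64-bit integer");

// Hypothetical result type: value == mantissa * 2^exponent, exactly.
struct exact_parts {
    std::int64_t mantissa;
    int exponent;
};

exact_parts to_exact_parts(double value) {
    assert(std::isfinite(value));        // NaNs and infinities have no exact ratio
    if (value == 0.0) return {0, 0};

    int exponent = 0;
    // value == fraction * 2^exponent with 0.5 <= |fraction| < 1
    double fraction = std::frexp(value, &exponent);

    // Scaling by 2^DBL_MANT_DIG is exact and makes fraction integer-valued.
    fraction = std::ldexp(fraction, DBL_MANT_DIG);
    exponent -= DBL_MANT_DIG;
    auto mantissa = static_cast<std::int64_t>(fraction);

    // Strip trailing zero bits so the implied ratio is in lowest terms.
    while (mantissa % 2 == 0) {
        mantissa /= 2;
        ++exponent;
    }
    return {mantissa, exponent};
}
```

A non-negative exponent means the value is the integer mantissa * 2^exponent; a negative one means the ratio is mantissa / 2^(-exponent).  No bit twiddling of the representation is needed.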

I wrote a quick and dirty program to do the tests mentioned in the comment and compiled it:

– for Windows using an ancient Microsoft compiler which doesn’t conform even to C++11 (although it has rvalue references and type traits templates, which is enough for it to get through my bignum stuff), and with a long double that’s just a double.

– for Linux using a somewhat more up-to-date GCC (C++14 at least), and with a bigger long double.

I got two different sets of results (Windows, Linux), possibly due to differences in the long double format.

– The Windows version had no trouble with either 0.0869406496067503 or 1.1100695288645402e-29, but it failed on 0.9999999999999999.

– The Linux version failed on both 0.0869406496067503 and 1.1100695288645402e-29, but it worked on 0.9999999999999999.

– Both failed on nextafter(1,0).

… 1.1100695288645402e-29 gives a “Can’t convert NaNs or infinities to bigint” because the
val0 = 1.0 / (val0  static_cast(int0)); step results in an inf.

(That should be static_cast<long double>(int0)…either WordPress or the browser itself thought that the <long double> was an unrecognized HTML tag.)

My Linux version reports that as well.  That’s a bug in my code which I’ll have to figure out if I decide to keep the continued fractions around for support of either FLT_RADIX != 2 or inexact conversions.
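One way that kind of failure can be avoided is to stop the continued-fraction expansion as soon as the fractional part reaches zero, before taking its reciprocal.  The sketch below shows the idea on a double with 64-bit integers; it is an assumption about how a guard could look, not the routine in the rational class, and the name approximate and the range limits in the asserts are made up to keep the arithmetic from overflowing.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical helper: the last continued-fraction convergent of value whose
// denominator does not exceed max_denominator, as {numerator, denominator}.
std::pair<std::int64_t, std::int64_t>
approximate(double value, std::int64_t max_denominator) {
    // Assumed limits that keep the 64-bit arithmetic below from overflowing.
    assert(std::isfinite(value) && value >= 0.0 && value < 2147483648.0);
    assert(max_denominator >= 1 && max_denominator <= 2147483648LL);

    double x = value;
    double int_part = std::floor(x);
    std::int64_t h_prev = 1, k_prev = 0;                         // previous convergent
    std::int64_t h = static_cast<std::int64_t>(int_part), k = 1; // current convergent
    for (;;) {
        const double frac = x - int_part;
        if (frac == 0.0) break;          // exact already: 1.0 / frac would be inf
        x = 1.0 / frac;
        int_part = std::floor(x);
        // Largest next term that keeps the denominator within bounds.
        const std::int64_t limit = (max_denominator - k_prev) / k;
        if (int_part > static_cast<double>(limit)) break;   // also catches inf
        const auto a = static_cast<std::int64_t>(int_part);
        const std::int64_t h_next = a * h + h_prev;
        const std::int64_t k_next = a * k + k_prev;
        h_prev = h;  k_prev = k;
        h = h_next;  k = k_next;
    }
    return {h, k};
}
```

Because the size of the next term is tested while it is still a double, a huge or infinite reciprocal ends the loop instead of reaching the integer conversion.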

I urge you to look towards an existing library if all you need is SQL interoperability.

Actually, what I need is to keep my mind active in my retirement. 😎

There are some third-party database access libraries out there; but they look to me like they were written by teenagers sitting in their basements going, “Cool!”; and I’m not sure that they scale up to real-world applications.  For example, there’s one called SOCI that actually tries to make database tables act like iostreams.

Also, I think I can make my library act like a Web service client as well, and maybe even access the cloud.  We’ll see…

I’m Still Here

Not much has been happening with me lately.

I had MRI and CT scans on Tuesday and both turned out OK, so on Wednesday I signed up for the study of whether prophylactic radiation is actually effective in keeping small cell cancer out of the brain.  The study is randomized but not blind, and I’ll find out on the 3rd or 4th whether I’ll be getting the radiation.

I’ve been a bit lackadaisical about finishing my C++ rational number library.  I think it’s ready to go, but I still have a bit more testing to do.  If anybody would like to suggest a change in the design, please do.

A Database Access Library

I’ve been yakking about how I’m working on a database access library, so maybe I should show you my current design to prove it. 😎  This should be the last of the dorky programming posts for a while.

The C++ standard library has nothing like the java.sql.stuff.  There are some third-party libraries out there, but they seem kind of klunky to me, and I suspect that they don’t scale up to real world use cases.  This makes C++ a really bad choice for use in the good old “business data processing” domain.*

I wanted a design that more closely matches common C++ idioms (container/iterator for example) so that coders wouldn’t have to learn a whole new way of thinking.  It’s by no means an exact match (cursors are not really iterators, for example); but I think it’s the same general idea.

I’m not ready to share any code yet.  Indeed, some of it might embarrass me. 😎  For a variety of reasons, it’s been a couple of years since I’ve worked on it; and I need to review some ODBC ugliness that I’ve forgotten before I can finish the SQL/CLI implementation.  I haven’t even started on the Web client business, which needs to be part of the proof of concept since I claim that I can make it work.

If there’s anybody out there who thinks this is basically a good design and would like to run with it themselves, I’ll happily share what I’ve got so far (probably about 1500 lines of code, not really a big deal yet).  I’m currently stuck on conversions between C++ and SQL types, which is central to making cursors work and depends on some ODBC stuff that I can’t remember.

My goal of getting WG21 to publish this in a TS will probably never be realized, but I think it’s worth doing anyway.  I’ll implement the rational number that I mentioned in the previous post first just to have some fun and get back in the groove; then I’ll see whether I can restart the work on the database library.**


*Business data processing will probably never be more than a niche market for C++, but that’s where the bulk of my experience is and so it’s what I know most about.

**My local PBS affiliate is starting up its quarterly pledge fortnight, so I shouldn’t have much TV to watch for a couple of weeks. 😎

Rational Math

Back when I was a C++ newbie, I decided to write a rational number class because I thought it would be a good exercise.  I revised it and added features over the years as I learned new stuff (I/O manipulators, for example).  I looked around my computer for it this morning and found a version that I wrote over a decade ago when I still didn’t have even a C++11 compiler to work with.

About 11 or 12 years ago, WG21’s numerics study group was thinking about publishing a Technical Specification (TS, a kind of warning about possible future standardization) that would propose some new number types (fixed-point types, an unbounded integer, things like that).  I confidently raised my hand and said that I had a rational number that I could add and proposed what would eventually morph into this*.  That “Numbers TS” is no longer in the works, so my rational number died.

I think I’ll revisit that to bring it at least to the C++20 level and implement it using the bigint class that I mentioned in the last post to avoid overflow.  I also want to put the I/O manipulators back just because I think they’re Really Cool. 😎

That should keep my brain working for a little while longer.


*Starting with the “[rational.math] Rational math” section, that’s what much of the C++ standard reads like.  It’s most definitely not a tutorial. 😎

Big Numbers

The code that I’ve posted about so far has been pretty light-weight.  Here are a couple of open-source classes that I hope could be useful in actual production code.

I have an unbounded integer and a big decimal that are intended to be C++ equivalents of SQL’s NUMERIC and DECIMAL types for use in a database access library that I’m working on.  They’re basically for storing big numbers and maybe doing a little bit of arithmetic.  The efficiency of my implementation might not be good enough for serious numerical work.

That’s particularly true for division.  I’m not as competent as I’d like to be in numerics; and when I tried to read about multi-word division in Knuth Vol. 2*, my eyes glazed over; and I had to revert to the good old “long division” routine that I learned in fourth grade.  I get a first trial divisor for each new digit of the quotient reasonably quickly; but if I guess wrong, which is likely, it takes linear time to get from that point to the right value.  If there’s a numerics expert out there who knows of a good way to do multi-word division, and if it turns out that I can comprehend it, I’d love to hear about it.
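For what it’s worth, the multi-word division in that Knuth volume is Algorithm D, and its central trick can be shown fairly compactly.  After shifting both operands so that the top bit of the divisor is set, a trial quotient digit taken from the top two limbs of the running remainder and the top limb of the divisor is never too small and at most 2 too large; one look at the next divisor limb fixes nearly all of those, and a rare add-back step fixes the rest.  The following is only a sketch of that idea with 32-bit limbs; the names limbs and divide are assumptions for the example and have nothing to do with the classes in the zip file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using limbs = std::vector<std::uint32_t>;   // little-endian, base 2^32

// Hypothetical helper: returns {quotient, remainder} of u / v.
// Results may carry leading zero limbs.
std::pair<limbs, limbs> divide(const limbs& u, const limbs& v) {
    const std::uint64_t base = std::uint64_t{1} << 32;
    const std::size_t m = u.size(), n = v.size();
    assert(n > 0 && v.back() != 0 && m >= n);

    limbs q(m - n + 1, 0);
    if (n == 1) {                            // single-limb divisor: short division
        std::uint64_t rem = 0;
        for (std::size_t j = m; j-- > 0;) {
            const std::uint64_t cur = (rem << 32) | u[j];
            q[j] = static_cast<std::uint32_t>(cur / v[0]);
            rem = cur % v[0];
        }
        return {q, limbs{static_cast<std::uint32_t>(rem)}};
    }

    // Normalize: shift left by s so the divisor's top bit is set.
    int s = 0;
    while (((v.back() << s) & 0x80000000u) == 0) ++s;
    auto shifted = [s](const limbs& x, std::size_t i) {   // limb i of (x << s)
        const std::uint64_t hi = i < x.size() ? x[i] : 0;
        const std::uint64_t lo = i > 0 ? x[i - 1] : 0;
        return static_cast<std::uint32_t>(((hi << 32 | lo) << s) >> 32);
    };
    limbs vn(n), un(m + 1);
    for (std::size_t i = 0; i < n; ++i) vn[i] = shifted(v, i);
    for (std::size_t i = 0; i <= m; ++i) un[i] = shifted(u, i);

    for (std::size_t j = m - n + 1; j-- > 0;) {
        // Trial digit from the top two remainder limbs and top divisor limb.
        const std::uint64_t top = (std::uint64_t{un[j + n]} << 32) | un[j + n - 1];
        std::uint64_t qhat = top / vn[n - 1];
        std::uint64_t rhat = top % vn[n - 1];
        // Refine using the second divisor limb (at most two corrections).
        while (qhat >= base || qhat * vn[n - 2] > ((rhat << 32) | un[j + n - 2])) {
            --qhat;
            rhat += vn[n - 1];
            if (rhat >= base) break;
        }
        // Multiply and subtract qhat * vn from the remainder window.
        std::uint64_t borrow = 0;
        for (std::size_t i = 0; i < n; ++i) {
            const std::uint64_t p = qhat * vn[i] + borrow;
            const std::uint32_t sub = static_cast<std::uint32_t>(p);
            borrow = p >> 32;
            if (un[i + j] < sub) ++borrow;
            un[i + j] -= sub;
        }
        const bool went_negative = un[j + n] < borrow;
        un[j + n] -= static_cast<std::uint32_t>(borrow);

        q[j] = static_cast<std::uint32_t>(qhat);
        if (went_negative) {                 // qhat was one too large: add back
            --q[j];
            std::uint64_t carry = 0;
            for (std::size_t i = 0; i < n; ++i) {
                const std::uint64_t sum = std::uint64_t{un[i + j]} + vn[i] + carry;
                un[i + j] = static_cast<std::uint32_t>(sum);
                carry = sum >> 32;
            }
            un[j + n] += static_cast<std::uint32_t>(carry);
        }
    }

    limbs r(n);                              // undo the normalization shift
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint64_t hi = i + 1 < n ? un[i + 1] : 0;
        r[i] = static_cast<std::uint32_t>((hi << 32 | un[i]) >> s);
    }
    return {q, r};
}
```

The work per quotient limb is one pass over the divisor plus a bounded number of corrections, so a wrong first guess no longer costs linear time to repair.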

The two documentation papers, all the source code for both classes, and the open-source license are zipped up here.


*Knuth, Donald E., The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition

The Amtrak-Related Code

Here’s the post about the three programs I mentioned the other day about Amtrak timetables and on-time performance.  They’re programs I wrote mostly for myself to use, not to create pretty output, but to be quick and dirty ways to get me information for planning trips.

All are pretty clunky.  You first have to load raw data from the Web into your browser, then save the data in a file on your machine, then use that file as the input to a program that you run from a command line.

Most folks reading this blog probably aren’t programmers; so if they’re interested in this at all, they probably don’t want to have to compile the code for themselves.  I’ve compiled them for both Linux and Windows; the Linux version should run on a Mac.


The timetable generator

Before you worry about my code at all, check out Christopher Juckins’ timetables.  He has both current and historical timetables that are pretty PDF files of the sort that Amtrak used to publish and might be much more to your liking.

My code generates timetables that look like (but bigger):

29-30 timetable

or:

2150 timetable

One advantage is that you can create timetables with different trains for each leg of a round trip:

321-302 timetable

but that takes a bit more work on your part.  Also with a bit more work, you can create timetables for trains 421 & 422, the Texas Eagle through cars to Los Angeles, and for the Portland section of the Empire Builder or the Boston section of the Lake Shore Limited, where Dixieland Software doesn’t provide the raw data in a single file.

The documentation is here; the open-source code, if you want to play with it, is here.

If you want to just run the program, the two executables are here.  The file without any extension on the filename is for Linux; the one that ends in “.exe” is for Windows.  Just unzip the one that you want and stick it in some directory on your hard drive that’s in your PATH environment variable; then in the documentation, click on “Instructions for use” in the table of contents.

The raw data for this comes from Dixieland Software.  If you’re only creating one or two timetables, it’s probably easier to just type the URL in your browser; if you might want to generate timetables for lots of trains, I’ve put my little HTML form at https://www.cstdbill.com/train/atksked.html so that you can just load that once into your browser and bookmark it.  (Dixieland Software uses just HTTP, not HTTPS, but you’re not transmitting any secrets, so I wouldn’t worry about it.)


Two on-time performance analyzers

This is actually a library that generates an HTML table showing minimum, maximum, median, mean and standard deviation of late times for particular trains at particular stations, or the likelihood of making connections between two trains.  (This is what finally goosed me to write the trivial statistics library that I mentioned in the previous post.)  The two programs I’m talking about here are extra added attractions.

The documentation is here.  If you’re a programmer who wants to play with the open-source code, there’s a link to a zip archive in the introduction.

Most of the documentation is geeky programming stuff; so if you just want to run the programs, go straight to “Two Programs that Use the Library” in the table of contents.  You’ll find links to the executables there.


A simple SQL database

Here are some musings about a possible design for an Amtrak-related database.  Although this is intended for testing the database access library that I’m working on, I’ll include it here because it’s about Amtrak.

My current design is here.

If anybody can think of anything else I should add to it, I’d love to hear about it; although it’s not about making reservations and shouldn’t have any PII in it.

Also, I currently have no clue where to get the data to load the consists table.  If anybody knows where I might find that on the Web, please let me know.  If it’s a secret that you don’t want to disclose in a comment, you can contact me privately at was@pobox.com.


A Trivial Statistics Library

A while back I wrote that I’d have some posts about some C++ code that I was working on, but then cancer got in the way.

I’ll begin with an open-source library of abecedarian statistics functions, just the minimum, maximum, median, mean, variance and standard deviation of data points that are strongly ordered.  I started with just standard containers of built-in numeric types, but quickly went over the top trying to make the library as generic as possible while keeping it simple for the built-in types.  I was basically just having fun with that.

[user documentation; doc., code and license]
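To give a flavor of what such functions compute, here is a bare-bones sketch for a vector of doubles.  It is not the library’s interface, which is generic; the names summary and summarize are invented for the example, and the choice of the population variance (dividing by n rather than n - 1) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type for the sketch.
struct summary {
    double minimum, maximum, median, mean, variance, std_dev;
};

// Takes the data by value because it sorts its own copy.
summary summarize(std::vector<double> data) {
    assert(!data.empty());
    std::sort(data.begin(), data.end());
    const std::size_t n = data.size();

    const double mean = std::accumulate(data.begin(), data.end(), 0.0) / n;

    double sum_sq = 0.0;                       // sum of squared deviations
    for (double x : data) sum_sq += (x - mean) * (x - mean);
    const double variance = sum_sq / n;        // population variance (assumed)

    // Middle element, or the average of the two middle elements.
    const double median = (n % 2 != 0)
        ? data[n / 2]
        : (data[n / 2 - 1] + data[n / 2]) / 2.0;

    return {data.front(), data.back(), median, mean, variance, std::sqrt(variance)};
}
```

For the data points 2, 4, 4, 4, 5, 5, 7 and 9, this gives a mean of 5, a median of 4.5, a variance of 4 and a standard deviation of 2.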

I’ll have the next couple of weeks free, so I’ll try to get more examples ready for posting.  I’ve written:

– A program for generating Amtrak timetables.

– A program for analyzing on-time performance of Amtrak trains.

– A program for guessing the likelihood of making connections between Amtrak trains.

And in support of a database access library that I’m writing:

– An unbounded integer type.

– A big decimal type.

– A preliminary design for an SQL database of Amtrak trains to test it.

The database access library is nowhere near being ready for prime time yet, but I have some preliminary user documentation for the current design.

If there’s any interest, I might stick it all out on github Real Soon Now.

It’s Official

I’ll be hosting a meeting of the ISO standards committee that I serve on.

It took a little longer to set up than I had expected; but our meetings can be a little complicated.  (I hope the hotel’s sales rep. didn’t get too exasperated with me.  I try to not be one of those 20% of customers they spend 80% of their time with; but details, details don’tcha know.)

We’re expecting to have about a hundred people show up with maybe twenty or thirty more Zooming in.  The meetings are scheduled for five and a half days with Monday and Saturday mornings being plenary sessions where we handle a variety of administrivia and take formal votes.  The committee’s real work is done in smaller breakout groups, some of the groups dealing with as many as twenty or so papers during the week.

I’m looking forward to actually hosting a meeting.  I’ve gotten a great deal more from my participation than I’ve given over the years; and now that I’m retired and getting older, I think it’s time for some payback.