Reading Directories in C++

This one is trivial.

For testing my timezone code, I often need to recurse through all the directories where the Zoneinfo binary files are. I got tired of rewriting the programs depending on whether I was testing on Linux or Windows, so I decided to write a little framework that would be portable to both operating systems. It doesn’t solve all the problems (the two systems provide different information about directory entries), but at least I don’t have to completely restructure the main loops.

There are no new ideas here: how to loop through directories is really old news; and since it’s not particularly original on my part, all the code is in the public domain.

Time Zones in C++

I’m back to working on a database access library in C++ and, at present, I’m developing a library of civil time classes that would mimic SQL’s datetime types. In the next two or three days, I hope to have a timezone class ready for prime time; and I’ve decided to document it separately from the rest of the civil time library because there’s no reason why it couldn’t be a stand-alone class in its own right.

I don’t have the final code ready to share yet, but I wanted to make the design available while the class is still under development in case there’s anybody out there who would like to suggest any additional features that I haven’t thought of.

Is there anything else you think I need?

Update: I just found out that Windows’ filesystem does indeed support symbolic links. (I’m not sure what planet I’ve been living on.) I’ve also figured out an easy way to create a .zip archive with the symlinks in it* and unzip that on my Windows box. I guess I have a bit of a redesign to work on.

*I can also create a .tar.gz that’s about one fifth the size of the .zip file, but I don’t see how to get the links as links (rather than copies of the file), and the .zip file is only about 2.5 MB.

More on Big Numbers in C++

I’ve been rather lackadaisical about fixing my “big number” classes, but I’ve finally gotten up off the couch, and I have new versions available.

I’ve changed the names of the bigint and bigdec classes to integer and decimal, respectively, because I thought the “big…” names smelled of Java. I also wrote a small Web page that ties the three classes together.

In a comment to a previous post, Andrew Dalke suggested some additional values I could test; and, sure enough, I found a bug (thanks). (The bug was actually in what’s now called the integer class: I hadn’t guarded against aliasing of the operands to expressions like x *= x.)
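For anyone who hasn’t been bitten by this kind of bug, here is a minimal sketch of one common way to make x *= x safe: build the product in a separate buffer and only then overwrite the left operand. The toy_integer type (base-10000 digits, least significant first, assumed non-empty) is invented for the example and is not the real integer class.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for an unbounded unsigned integer.
struct toy_integer {
    static constexpr std::uint32_t base = 10000;
    std::vector<std::uint32_t> digits;   // least significant digit first

    toy_integer& operator*=(const toy_integer& rhs)
    {
        // Accumulate into a separate buffer.  *this and rhs are only read
        // until the product is complete, so it doesn't matter whether they
        // are the same object.
        std::vector<std::uint32_t> product(digits.size() + rhs.digits.size(), 0);
        for (std::size_t i = 0; i < digits.size(); ++i) {
            std::uint32_t carry = 0;
            for (std::size_t j = 0; j < rhs.digits.size(); ++j) {
                // Worst case is about 1e8, well inside 32 bits.
                std::uint32_t cur =
                    product[i + j] + digits[i] * rhs.digits[j] + carry;
                product[i + j] = cur % base;
                carry = cur / base;
            }
            product[i + rhs.digits.size()] += carry;
        }
        // Drop leading zeros but keep at least one digit.
        while (product.size() > 1 && product.back() == 0) product.pop_back();
        digits = std::move(product);
        return *this;
    }
};
```

An in-place version that writes digits of *this while still reading rhs would be faster, but it is exactly the kind of code that goes wrong when the two operands are the same object.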

I think I remember someone suggesting that some users might prefer classes with “more features”. What additional features did you have in mind? Don’t suggest trig. functions and the like: I’ve limited the <cmath>-like functions that take decimal and rational arguments to those that return exact values. (You might be able to talk me into square root, but that would be successive approximation using Newton’s method which is what I know how to write. I already have a version of the rational class that has quiet NaNs and infinities and a spaceship operator, but I’m not sure I like it.)
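To show what “successive approximation using Newton’s method” amounts to, here is a rough sketch for the simplest case, the floor of an integer square root on a built-in 64-bit type. The name isqrt is an assumption for the example; a decimal or rational version would need a scale and an accuracy argument, which this sketch ignores.

```cpp
#include <cstdint>

// Floor of the square root of n by Newton's method on integers.
std::uint64_t isqrt(std::uint64_t n)
{
    if (n < 2) return n;

    // Any starting guess >= sqrt(n) works; n/2 + 1 is one, and it keeps
    // x + n / x from overflowing even for the largest n.
    std::uint64_t x = n / 2 + 1;
    std::uint64_t y = (x + n / x) / 2;

    // The guesses decrease until they reach floor(sqrt(n)).
    while (y < x) {
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}
```

The loop needs only addition, division and comparison, so the same shape should carry over to an unbounded integer type.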

As I’ve said before, these classes are not intended for serious numerical work; and numerics experts probably already know where to find better implementations, or could write such themselves.

Update, 2024-02-05: I woke up this morning having in my mind a way to make rational comparisons a bit quicker, so I made that change. I also noticed that I had failed to remove an isinf() test of the accuracy argument to the conversion from floating point values. (In the previous version, passing a NaN for the accuracy would trigger an exact conversion using std::frexp(). In the current version, any non-finite accuracy will do that.)

By the way, if anybody out there has access to a C++ implementation where FLT_RADIX != 2, I’d appreciate a test of the frexp() business. I have access only to boxes where FLT_RADIX is 2.

More on the `rational` Class

Andrew Dalke did a few tests of my rational number library (thanks) and I thought it deserved a post of its own.

The first links to code in the function float_as_integer_ratio_impl(), which is the C implementation of the Python method as_integer_ratio() …

OK, I found it. Yeah, I tried to use frexp(); but I tried to do some bit twiddling to get the mantissa bits as an integer and that was a big mistake. It’s easy once you know how to do it.

I’ll add something like this to my rational class; but I’m not sure yet whether I want to make it only for FLT_RADIX == 2 and whether I want to allow the user to do something less than an exact conversion, falling back on the continued fractions routine if I need to.
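The frexp() idea can be sketched as follows. This is a simplified illustration, not the code that will go into the rational class: it assumes FLT_RADIX == 2, and it uses 64-bit integers for the result, so very large or very small values are rejected where an unbounded integer would just keep going. The function name is borrowed from the Python method.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <utility>

// Exact numerator/denominator of a finite double (sketch, FLT_RADIX == 2).
std::pair<std::int64_t, std::int64_t> as_integer_ratio(double value)
{
    if (!std::isfinite(value))
        throw std::domain_error("can't convert NaN or infinity");

    int exponent = 0;
    double mantissa = std::frexp(value, &exponent);  // value == mantissa * 2^exponent

    // Doubling is exact in binary floating point, so this strips the
    // fraction bits one at a time without any rounding.
    while (mantissa != std::floor(mantissa)) {
        mantissa *= 2.0;
        --exponent;
    }

    std::int64_t numerator = static_cast<std::int64_t>(mantissa);
    std::int64_t denominator = 1;
    const std::int64_t limit = std::numeric_limits<std::int64_t>::max();

    if (exponent > 0) {
        const std::int64_t magnitude = numerator < 0 ? -numerator : numerator;
        if (exponent > 62 || magnitude > (limit >> exponent))
            throw std::range_error("numerator needs more than 64 bits");
        numerator *= std::int64_t(1) << exponent;
    } else if (exponent < 0) {
        if (exponent < -62)
            throw std::range_error("denominator needs more than 64 bits");
        denominator = std::int64_t(1) << -exponent;
    }
    return {numerator, denominator};
}
```

Because the loop stops at the first integer it reaches, the numerator is odd (or zero) whenever the denominator is greater than 1, so the fraction comes out in lowest terms without a gcd step.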

I wrote a quick and dirty program to do the tests mentioned in the comment and compiled it:

– for Windows using an ancient Microsoft compiler which doesn’t conform even to C++11 (although it has rvalue references and type traits templates which is enough for it to get through my bignum stuff), and with a long double that’s just a double.

– for Linux using a somewhat more up-to-date GCC (C++14 at least), and with a bigger long double.

I got two different sets of results (Windows, Linux), possibly due to differences in the long double format.

– The Windows version had no trouble with either 0.0869406496067503 or 1.1100695288645402e-29, but it failed on 0.9999999999999999.

– The Linux version failed on both 0.0869406496067503 and 1.1100695288645402e-29, but it worked on 0.9999999999999999.

– Both failed on nextafter(1,0).

… 1.1100695288645402e-29 gives a “Can’t convert NaNs or infinities to bigint” because the
val0 = 1.0 / (val0 - static_cast(int0)); step results in an inf.

(That should be static_cast<long double>(int0)…either WordPress or the browser itself thought that the <long double> was an unrecognized HTML tag.)

My Linux version reports that as well. That’s a bug in my code which I’ll have to figure out if I decide to keep the continued fractions around for support of either FLT_RADIX != 2 or inexact conversions.
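One way such a step might be guarded is sketched below. This is a guess at the shape of a fix, not my actual routine: the function name, the max_den parameter and the use of 64-bit integers are all assumptions made for the example, and the value is assumed to be finite and small enough that the numerator fits.

```cpp
#include <cmath>
#include <cstdint>
#include <utility>

// Approximate a finite value by p/q with 0 < q <= max_den (max_den >= 1),
// using continued-fraction convergents.
std::pair<std::int64_t, std::int64_t>
approximate(long double value, std::int64_t max_den)
{
    // Seed the recurrence with the conventional convergents 0/1 and 1/0.
    std::int64_t p_prev = 0, q_prev = 1, p = 1, q = 0;
    long double x = value;

    for (;;) {
        const long double whole = std::floor(x);
        // After the first term, a huge partial quotient means the previous
        // convergent is already as good as we can represent; stop before
        // the cast below can overflow.
        if (q != 0 && whole > static_cast<long double>(max_den)) break;

        const std::int64_t a = static_cast<std::int64_t>(whole);
        const std::int64_t p_next = a * p + p_prev;
        const std::int64_t q_next = a * q + q_prev;
        if (q_next > max_den) break;
        p_prev = p;  q_prev = q;
        p = p_next;  q = q_next;

        const long double frac = x - whole;
        if (frac == 0.0L) break;   // exact: 1.0 / frac would be an inf
        x = 1.0L / frac;
    }
    return {p, q};
}
```

With this guard, a tiny input like 1.1100695288645402e-29 with a modest max_den comes back as 0/1 instead of feeding an infinity to the integer conversion.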

I urge you to look towards an existing library if all you need is SQL interoperability.

Actually, what I need is to keep my mind active in my retirement.

There are some third-party database access libraries out there; but they look to me like they were written by teenagers sitting in their basements going, “Cool!”; and I’m not sure that they scale up to real-world applications. For example, there’s one called SOCI that actually tries to make database tables act like iostreams.

Also, I think I can make my library act like a Web service client as well, and maybe even access the cloud. We’ll see…

Big Numbers in C++

I finally got up off the couch and finished testing my rational number class; and I decided to put that along with the unbounded integer and big decimal code into a single library. If you’re interested, see https://www.cstdbill.com/bignum/bignum.html.

I’m Still Here

Not much has been happening with me lately.

I had MRI and CT scans on Tuesday and both turned out OK, so on Wednesday I signed up for the study of whether prophylactic radiation is actually effective in keeping small cell cancer out of the brain. The study is randomized but not blind, and I’ll find out on the 3rd or 4th whether I’ll be getting the radiation.

I’ve been a bit lackadaisical about finishing my C++ rational number library. I think it’s ready to go, but I still have a bit more testing to do. If anybody would like to suggest a change in the design, please do.

A Database Access Library

I’ve been yakking about how I’m working on a database access library, so maybe I should show you my current design to prove it. This should be the last of the dorky programming posts for a while.

The C++ standard library has nothing like the java.sql stuff. There are some third-party libraries out there, but they seem kind of klunky to me, and I suspect that they don’t scale up to real world use cases. This makes C++ a really bad choice for use in the good old “business data processing” domain.*

I wanted a design that more closely matches common C++ idioms (container/iterator for example) so that coders wouldn’t have to learn a whole new way of thinking. It’s by no means an exact match (cursors are not really iterators, for example); but I think it’s the same general idea.

I’m not ready to share any code yet. Indeed, some of it might embarrass me. For a variety of reasons, it’s been a couple of years since I’ve worked on it; and I need to review some ODBC ugliness that I’ve forgotten before I can finish the SQL/CLI implementation. I haven’t even started on the Web client business which needs to be part of the proof of concept since I claim that I can make it work.

If there’s anybody out there who thinks this is basically a good design and would like to run with it themselves, I’ll happily share what I’ve got so far (probably about 1500 lines of code, not really a big deal yet). I’m currently stuck on conversions between C++ and SQL types, which is central to making cursors work and depends on some ODBC stuff that I can’t remember.

My goal of getting WG21 to publish this in a TS will probably never be realized, but I think it’s worth doing anyway. I’ll implement the rational number that I mentioned in the previous post first just to have some fun and get back in the groove; then I’ll see whether I can restart the work on the database library.**

*Business data processing will probably never be more than a niche market for C++, but that’s where the bulk of my experience is and so it’s what I know most about.

**My local PBS affiliate is starting up its quarterly pledge fortnight, so I shouldn’t have much TV to watch for a couple of weeks.

Rational Math

Back when I was a C++ newbie, I decided to write a rational number class because I thought it would be a good exercise. I revised it and added features over the years as I learned new stuff (I/O manipulators, for example). I looked around my computer for it this morning and found a version that I wrote over a decade ago when I still didn’t have even a C++11 compiler to work with.

About 11 or 12 years ago, WG21’s numerics study group was thinking about publishing a Technical Specification (TS, a kind of warning about possible future standardization) that would propose some new number types (fixed-point types, an unbounded integer, things like that). I confidently raised my hand and said that I had a rational number that I could add and proposed what would eventually morph into this*. That “Numbers TS” is no longer in the works, so my rational number died.

I think I’ll revisit that to bring it at least to the C++20 level and implement it using the bigint class that I mentioned in the last post to avoid overflow. I also want to put the I/O manipulators back just because I think they’re Really Cool.

That should keep my brain working for a little while longer.

*Starting with the “[rational.math] Rational math” section, that’s what much of the C++ standard reads like. It’s most definitely not a tutorial.

Big Numbers

The code that I’ve posted about so far has been pretty light-weight. Here are a couple of open-source classes that I hope could be useful in actual production code.

I have an unbounded integer and a big decimal that are intended to be C++ equivalents of SQL’s NUMERIC and DECIMAL types for use in a database access library that I’m working on. They’re basically for storing big numbers and maybe doing a little bit of arithmetic. The efficiency of my implementation might not be good enough for serious numerical work.

That’s particularly true for division. I’m not as competent as I’d like to be in numerics; and when I tried to read about multi-word division in Knuth Vol. 2*, my eyes glazed over; and I had to revert to the good old “long division” routine that I learned in fourth grade. I get a first trial divisor for each new digit of the quotient reasonably quickly; but if I guess wrong, which is likely, it takes linear time to get from that point to the right value. If there’s a numerics expert out there who knows of a good way to do multi-word division, and if it turns out that I can comprehend it, I’d love to hear about it.

The two documentation papers, all the source code for both classes, and the open-source license are zipped up here.

*Knuth, Donald E., The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition

The Amtrak-Related Code

Here’s the post about the three programs I mentioned the other day about Amtrak timetables and on-time performance. They’re programs I wrote mostly for myself to use, not to create pretty output, but to be quick and dirty ways to get me information for planning trips.

All are pretty clunky. You first have to load raw data from the Web into your browser, then save the data in a file on your machine, then use that file as the input to a program that you run from a command line.

Most folks reading this blog probably aren’t programmers; so if they’re interested in this at all, they probably don’t want to have to compile the code for themselves. I’ve compiled them for both Linux and Windows; the Linux version should run on a Mac.

The timetable generator

Before you worry about my code at all, check out Christopher Juckins’ timetables. He has both current and historical timetables that are pretty PDF files of the sort that Amtrak used to publish and might be much more to your liking.

My code generates timetables that look like (but bigger):

[Image: 29-30 timetable]

or:

[Image: 2150 timetable]

One advantage is that you can create timetables with different trains for each leg of a round trip:

[Image: 321-302 timetable]

but that takes a bit more work on your part. Also with a bit more work, you can create timetables for trains 421 & 422, the Texas Eagle through cars to Los Angeles, and for the Portland section of the Empire Builder or the Boston section of the Lake Shore Limited, where Dixieland Software doesn’t provide the raw data in a single file.

The documentation is here; the open-source code, if you want to play with it, is here.

If you want to just run the program, the two executables are here. The file without any extension on the filename is for Linux; the one that ends in “.exe” is for Windows. Just unzip the one that you want and stick it in some directory on your hard drive that’s in your PATH environment variable; then in the documentation, click on “Instructions for use” in the table of contents.

The raw data for this comes from Dixieland Software. If you’re only creating one or two timetables, it’s probably easier to just type the URL in your browser; if you might want to generate timetables for lots of trains, I’ve put my little HTML form at https://www.cstdbill.com/train/atksked.html so that you can just load that once into your browser and bookmark it. (Dixieland Software uses just HTTP, not HTTPS, but you’re not transmitting any secrets, so I wouldn’t worry about it.)

Two on-time performance analyzers

This is actually a library that generates an HTML table showing minimum, maximum, median, mean and standard deviation of late times for particular trains at particular stations, or the likelihood of making connections between two trains. (This is what finally goosed me to write the trivial statistics library that I mentioned in the previous post.) The two programs I’m talking about here are extra added attractions.
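For the curious, the five numbers in that table can be computed along these lines. This is just a sketch, not the statistics library itself: the names summary and summarize are invented here, the input is assumed to be non-empty, and the choice of the sample (n - 1) standard deviation is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct summary {
    double min, max, median, mean, stddev;
};

// Summarize a non-empty list of late times (in minutes, say).
summary summarize(std::vector<double> v)   // by value: we sort our own copy
{
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();

    double sum = 0.0;
    for (double x : v) sum += x;
    const double mean = sum / static_cast<double>(n);

    double squares = 0.0;
    for (double x : v) squares += (x - mean) * (x - mean);
    // Sample standard deviation; defined as 0 for a single observation.
    const double stddev =
        n > 1 ? std::sqrt(squares / static_cast<double>(n - 1)) : 0.0;

    // Middle value, or the average of the two middle values.
    const double median =
        n % 2 != 0 ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;

    return {v.front(), v.back(), median, mean, stddev};
}
```

For example, late times of 5, 0, 10 and 25 minutes have a median of 7.5 and a mean of 10.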

The documentation is here. If you’re a programmer who wants to play with the open-source code, there’s a link to a zip archive in the introduction.

Most of the documentation is geeky programming stuff; so if you just want to run the programs, go straight to “Two Programs that Use the Library” in the table of contents. You’ll find links to the executables there.

A simple SQL database

Here are some musings about a possible design for an Amtrak-related database. Although this is intended for testing the database access library that I’m working on, I’ll include it here because it’s about Amtrak.

My current design is here.

If anybody can think of anything else I should add to it, I’d love to hear about it; although it’s not about making reservations and shouldn’t have any PII in it.

Also, I currently have no clue where to get the data to load the consists table. If anybody knows where I might find that on the Web, please let me know. If it’s a secret that you don’t want to disclose in a comment, you can contact me privately at was@pobox.com.

Bill Seymour

Just another Freethought Blogs site
