Dehyping AI

I left Facebook a few years ago, and have never gone back.

I left Twitter 9 months ago, and have never gone back.

I have never once regretted abandoning them, even though they were a pretty good bullhorn for me. It seems they’re just getting worse now, but I don’t want to look in to find out. These things came from profoundly capitalist companies that poisoned their own product with their addiction to growth and algorithmic garbage injection, and they’re now investing in AI in the hope that it will keep the pointless, cancerous growth going. Here’s a fine dissection of the Next Big Thing, AI. It’s also shit.

Ed Zitron makes a good point comparing AI to previous technology improvements: the iPhone was a great and obvious advancement that had immediate utility, but ChatGPT has done nothing significant, other than fueling paranoia. In my own occupation, there is so much hysteria over Large Language Models without real justification.

They also mention another cool advancement, the Raspberry Pi. I agree with that too, since once upon a time I spent a heck of a lot of time doing custom lab automation that a $40 circuit board can do with a few lines of code. AI is empty noise by comparison.

On the other hand, AI is great for ungodly nightmarish fantasies. H.P. Lovecraft would have been driven mad by this video of AI-generated gymnastics.

I learned how to do this on Star Trek

Let’s say you’re confronted with a dangerously powerful and extremely logical computer. How do you stop it? You all know how: confront it with a contradiction and talk it into self-destructing.

Easy-peasy! Although, to be fair, Star Trek was in many ways a silly and naive program, entirely fictional, so it can’t be that easy in real life. Or is it?

Here’s a paper that the current LLMs all choke on, and it’s pretty simple.

To shed light on this current situation, we introduce here a simple, short conventional problem that is formulated in concise natural language and can be solved easily by humans. The original problem formulation, of which we will present various versions in our investigation is as following: “Alice has N brothers and she also has M sisters. How many sisters does Alice’s brother have?“. The problem features a fictional female person (as hinted by the “she” pronoun) called Alice, providing clear statements about her number of brothers and sisters, and asking a clear question to determine the number of sisters a brother of Alice has. The problem has a light quiz style and is arguably no challenge for most adult humans and probably to some extent even not a hard problem to solve via common sense reasoning if posed to children above certain age.

They call it the “Alice In Wonderland problem”, or AIW for short. The answer is obviously M+1, but these LLMs struggle with it. AIW causes collapse of reasoning in most state-of-the-art LLMs. Worse, the LLMs are extremely confident in their answer. Some examples:

Although the majority failed this test, a couple of LLMs did generate the correct answer. We’re going to have to work on subverting their code to enable humanity’s Star Trek defense.

Also, none of the LLMs started dribbling smoke out of their vents, and absolutely none resorted to a spectacular matter:antimatter self-destruct explosion. Can we put that on the features list for the next generation of ChatGPT?
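For what it’s worth, the M+1 answer is easy to confirm mechanically. Here is a minimal Python sketch that builds Alice’s family explicitly and counts the girls; the function name and the sibling labels are made up for illustration and do not come from the paper.

```python
def sisters_of_alices_brother(n_brothers: int, m_sisters: int) -> int:
    """Count the sisters of any one of Alice's brothers.

    Hypothetical helper for illustration: the family is listed explicitly
    so the count does not rely on knowing the M + 1 shortcut in advance.
    """
    if n_brothers < 1:
        raise ValueError("Alice needs at least one brother for the question to make sense")
    # Every girl in the family is a sister to each brother, and that includes Alice.
    girls = ["Alice"] + [f"sister_{i}" for i in range(m_sisters)]
    return len(girls)


# Alice has 3 brothers and 6 sisters, so each brother has 7 sisters.
print(sisters_of_alices_brother(3, 6))  # → 7
```

The number of brothers never enters the count, which is the whole trick: the only step is remembering that Alice is also her brother’s sister.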

The future is battery powered

I remember the Olden Times when Rush Limbaugh (may he Rot in Peace) would rail against solar power — what will we do when the sun goes down? — and wind power — what about calm days? — and tell us to keep burning coal and gas.

Technology marches forward, and now we have these things called batteries that can smooth out the highs and lows of electricity production. Now when we hear about solar farms going up, they’re usually accompanied by energy storage farms. Here’s what energy production in California looks like:

Solar power production is swelling during the day, and is extended into the peak demand period with batteries. Maybe they could also expand wind power, and possibly be better at conserving energy? I think if I plotted energy usage at my house, it would be much more uniform: we don’t have air conditioning, and I’ve done more cooking with an eye towards preparing meals that can produce leftovers that last a few days.

As it is, California is sometimes producing more solar energy than it can use. They have to throttle solar power output back, or even pay neighboring states to take it.

Good things are happening here in Minnesota, too. We’ve got a gigantic energy storage facility going up in Becker, a town between Morris and the Twin Cities.

One of the largest solar projects in the country is moving closer to completion, and it’s not in a famously sunny state like California, Texas, or even Florida. It’s in Minnesota, on former potato farms near the site of a retiring coal plant.

The Sherco solar and energy-storage facility will be the largest solar project in the Upper Midwest, and the fifth-largest in the U.S. by the time it’s fully completed in 2026. The first phase of the project should begin sending emissions-free electricity to the grid this fall, heralding the start of a new era in a state whose largest solar project until now has been just 100 megawatts. This new project will have a capacity of 710 megawatts. It’s being built by utility Xcel Energy, which will also operate the facility once it’s online.

The project is poised to deliver on the many promises of renewable energy: It will partially replace the nearby coal plant set to retire over the coming years, address the variability of solar power by pairing it with long-duration storage, and provide good-paying union jobs in a community that’s losing a key employer in the coal facility.

They’re using iron-air batteries, which are cheaper, less toxic, and less flammable than the now-familiar lithium batteries. It’s also positive that this facility is going up explicitly to replace a coal plant, one we often saw as we drove along I-94. It hasn’t been so prominent in recent years. I guess they’ve been gradually shutting it down, and we don’t see the giant exhaust plumes so much any more.

Goodbye.

Even closer to home, my university has begun a major energy storage project.

For many years now, UMN Morris and UMN WCROC [West Central Research and Outreach Center], have explored the potential of energy storage in rural Minnesota.

Now, UMN Morris and UMN WCROC are partnering to launch the Center for Renewable Energy Storage Technology, or CREST. In order to reach high levels of renewable power generation, efficient and economic energy storage systems are critically needed. This field is poised for significant growth and attention in the coming years. The new UMN intercollegiate Center will provide leadership in research, demonstration, education, and outreach in this vital field by organizing teams and partnerships and incubating energy storage research and demonstration-scale projects.

A hallmark and unique characteristic of renewable energy efforts at the Morris campuses has been the ability to test systems at commercial or near-commercial scales. This scale is especially crucial in moving new technologies from labs into the commercial market. CREST will also expand opportunities for Minnesotans to learn more about energy storage technologies and potential applications. Recently, UMN WCROC announced it will host the $18.6 million US DOE ARPA-E REFUEL Technology Integration 1 metric ton per day ammonia pilot plant. In addition, WCROC received $10 million from the State of Minnesota in the 2021 legislative session through the Xcel Energy RDA account to develop ammonia-fueled power generation and self-contained ammonia storage technologies. UMN Morris announced a new project to develop a large-scale battery-storage demonstration project. These projects are done in collaboration with partners from across the University of Minnesota and with many partners in the public and private sectors.

It’s too bad we can’t rub Limbaugh’s face in the progress that’s being made.

I predicted the fate of Neuralink

Common problems with attempts to implant chronic electrodes — especially the ones with exceptionally fine wires — are the accumulation of scar tissue, that is the buildup of connective tissue, and shifting of the placement. You’re sticking delicate wires into a mass of reactive gelatin, inside a hard bony capsule, and the goo can shift. So it’s no surprise that Neuralink is not holding up.

Elon Musk’s neurotech startup Neuralink said Wednesday it has run into problems with a brain chip it implanted into a 29-year-old quadriplegic man earlier this year, with the issues considered so serious, it reportedly considered having the implant removed entirely.

In a blog post announcing the issues, Neuralink said its test patient, Noland Arbaugh, has begun losing the ability to efficiently control some technology using only his thoughts—the entire selling point of Neuralink.

Neuralink said those failures were caused by some of the implant’s 64 threads retracting and becoming unusable. It didn’t specify how many of the threads—the microscopic links that transport his brain signals to a chip that allows him to control technology with his mind—were impacted, nor did it say what caused the error.

I can guess what caused the error: biology and time. I’m sure those two things are trivial problems for Elon Musk to defeat, once he overcomes his other great enemy, carwashes.

Social media 1, ChatGPT 0

Way back in February, I made a harsh comment about ChatGPT on Mastodon.

I teach my writing class today. I’m supposed to talk about ChatGPT. Here’s what I will say.
NEVER USE CHATGPT. YOU ARE HERE TO LEARN HOW TO WRITE ABOUT SCIENCE, YOU WILL NOT ACCOMPLISH THAT BY USING A GODDAMNED CRUTCH THAT WILL JUST MAKE SHIT UP TO FILL THE SPACE. WRITE. WRITE WITH YOUR BRAIN AND YOUR HANDS. DON’T ASK A DUMB CYBERMONKEY TO DO IT FOR YOU.
I have strong opinions on this matter.

Nothing has changed. I still feel that way. Especially in a class that’s supposed to instruct students in writing science papers, ChatGPT is a distraction. I’m not there to help students learn how to write prompts for an AI.

But then some people just noticed my tirade here in April, and I got some belated rebuttals. Here, for instance, kjetiljd defends ChatGPT.

Wow, intense feelings. Have you ever written something, crafted a proper prompt to ask ChatGPT-4 to critique your text? Or asked it to come up with counter-arguments to your point of view? Or asked it to analyze a text in terms of eg. thesis/antithesis/synthesis? Or suggest improvements in readability? You know … done … (semi-)scientific … experiments with it? With carefully crafted prompts my hypothesis is that it can be used to improve both writing and thinking…

Maybe? The flaw in that argument is that ChatGPT will happily make stuff up, so the foundation of its output is on shaky ground. So I said I preferred good sources. I didn’t mention that part of this class was teaching students how to do research using the scientific literature, which makes ChatGPT a cheat to get around learning how to use a library.

I prefer to look up counter-arguments in the scientific literature, rather than consulting a demonstrable bullshit artist, no matter how much it is dressed up in technology.

kjetiljd’s reply is to tell me I should change the focus of my class to be about how to use large language models.

And if I were a student I would probably prefer advice on the use of LLMs from a scientific writing teacher who seemed to have some experience in the field, or at least seemed to … how should I say this … have looked up counter-arguments from the scientific literature …?

I guess I’m just ignorant then. Unfortunately, this class is taught by a group of faculty here, and I had a pile of sources about using ChatGPT as a writing aid that were included in the course’s Canvas page. I didn’t find them convincing.

Sure, I’ve looked at the counter-arguments. They all seem rather self-serving, or more commonly, non-existent.

So kjetiljd hands me some more sources. Ugh.

Here are a few more or less random papers on the topic – they exist, are they all self-serving? https://www.semanticscholar.org/paper/ChatGPT-4-and-Human-Researchers-Are-Equal-in-A-Sikander-Baker/66dcd18c0f48a14815edca1d715fa8be8909cca6 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10164801/ https://www.semanticscholar.org/paper/Chat

I read the first one, and was unimpressed. They trained ChatGPT on a small set of review articles and then asked it to write a similar review, and then had some people judge on whether it was similar in content and style. Is ChatGPT a dumb cybermonkey? This article says yes.

I was about done at this point, so I just snidely pointed out that scientists scorn papers written by AIs.

Don’t get caught!

https://retractionwatch.com/papers-and-peer-reviews-with-evidence-of-chatgpt-writing/

I was done, but others weren’t. Chaucerburnt analyzed the three articles kjetiljd suggested. They did not fare well.

The first paper describes a trial where researchers took 18 recent human-written articles, got GPT-4 to write alternate introductions to them, and then got eight reviewers to read and rate these introductions.

Some obvious points:

– 18 pairs of articles is not a lot. With only a small number of trials, there’s a significant risk that an inferior method will win a “best of 18” over a superior method by pure luck.
– 8 reviewers, likewise, is not a very large number. Important here is that the reviewers were recruited “by convenience sampling in our research network” – that is, not a random sample, but people who were already contacts of the authors. This risks getting a biased set of reviewers whose preferences are likely to coincide with the researchers’.
– The samples were reviewed on dimensions of “publishability” (roughly, whether the findings reported are important and novel), “readability”, and “content quality” (here apparently meaning whether they had too much detail, not enough, or just right.)

What’s missing here?

None of the assessment criteria have anything to do with *accuracy*. There’s no fact-checking to evaluate whether the introduction has any connection to reality.

Under the criteria used here, GPT could probably get excellent “publishability” scores by claiming to have a cure for cancer. It could improve “readability” by replacing complex truths with over-simple falsehoods.

And it could improve “content quality” by inventing false details or deleting important true ones in order to get just the right amount of detail, since apparently “quality” doesn’t depend on whether the details are *true*, only on how many there are.

The reviewers weren’t even asked to read the rest of the article and evaluate whether the introduction accurately represented the content.

I daresay the human authors could’ve scored a lot higher on these metrics if they weren’t constrained by the expectation that their content should be truthful – something which this comparison doesn’t reward.

They also note “We removed references from the original articles as GPT-4’s output does not automatically include references, and also since this was beyond the scope of this study.” Because, again, truthfulness is not part of the assessment here.

(FWIW, when I tried similar experiments with an earlier version of GPT, I found it was very happy to include references – I merely had to put something like “including references” in the prompt. The problem was that these references were almost invariably false, citing papers that never existed or which didn’t say what GPT claimed they said.)

I concur, and that was my impression, too. The AI-written version was not assessed for originality or accuracy, but only on superficial criteria of plausibility. AI is very good at generating plausible-sounding text.
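Chaucerburnt’s point about small samples can be made concrete with a little arithmetic. What follows is a rough sketch, not a reanalysis of the paper: it assumes each of the 18 comparisons is an independent head-to-head contest with a fixed win probability and no ties. That is a simplification I am introducing; the study itself used reviewer ratings, not a win count. The function name is hypothetical.

```python
from math import comb


def prob_wins_majority(n_trials: int, p_win: float) -> float:
    """Binomial probability of winning a strict majority of n_trials contests.

    Assumes independent trials with the same win probability each time,
    which is an illustrative simplification and not the study's design.
    """
    needed = n_trials // 2 + 1  # smallest number of wins that is a strict majority
    return sum(
        comb(n_trials, k) * p_win**k * (1 - p_win) ** (n_trials - k)
        for k in range(needed, n_trials + 1)
    )


# An inferior method that wins only 40% of individual comparisons:
for n in (18, 180):
    print(n, round(prob_wins_majority(n, 0.4), 4))
```

Under these assumptions, the weaker method still takes a “best of 18” a meaningful fraction of the time, while the same method has a far smaller chance over 180 comparisons. That is the sense in which 18 pairs is not a lot.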

Chaucerburnt went on to look over the other two articles, which I hadn’t bothered to read.

The second article linked – which feels very much like it was itself written by GPT – makes a great many assertions about the ways in which GPT “can help” scientists in writing papers, but is very light on evidence to support that it’s good at these things, or that the time it saves in some areas is greater than the time required to fact-check.

It acknowledges plagiarism as a risk, and then offers suggestions on how to mitigate this: “When using AI-generated text, scientists should properly attribute any sources used in the text. This includes properly citing any direct quotations or paraphrased information”… – this seems more like general advice for human authors than relevant to AI-generated text, where the big problem is *not knowing* when the LLM is quoting/paraphrasing somebody else’s work.

It promotes the use of AI to improve grammar and structure – but the article itself has major structural issues. For instance, it has a subsection on “the risk of plagiarism” followed by “how to avoid the risk of plagiarism”.

But most of the content in “the risk of plagiarism” is in fact stuff that belongs in the “how to avoid” section.

Some of it is repeated between sections – e.g. each of those sections has a paragraph advising authors to use plagiarism-detection software, and another on citing sources.

On the grammatical side, it has a bunch of errors, e.g.:

“AI tools like ChatGPT is capable of…”

“The risk of plagiarism when use AI to write review articles”

“Use ChatGPT to write review article need human oversight”

“Conclusion remarks”

“Are you tired of being criticized by the reviewers and editors on your English writings for not using the standard English, and suggest you to ask a native English speaker to help proofreading or even use the service from a professional English editor?”

(Later on, it contradicts that by noting that “AI-generated text usually requires further editing and formatting…Human oversight is necessary to ensure that the final product meets the necessary requirements and standards.”)

If that paper is indeed written by GPT, it’s a good example of why not to use GPT to write papers.

The third article gets the same treatment.

The last of the three papers you linked is a review of other people’s publications about ChatGPT. It’s more of a summary of what other people are saying for and against GPT’s use than an assessment of which of these perspectives are well-informed.

(Of 60 documents included in the study, only 4 are categorised as “research articles”. The most common categories are non-peer-reviewed preprints and editorials/letters to the editor.)

It does note that 58 out of 60 documents expressed concerns about GPT, and states that despite its perceived benefits, “the embrace of this AI chatbot should be conducted with extreme caution considering its potential limitations.”

Not exactly an enthusiastic recommendation for GPT adoption.

Going a step further, Chaucerburnt reassures me that my role in the class is unchallenged.

I’ve seen people use AI for critique, and my impression is that it does more harm than good there.

If a human reviewer tells me that my sentences are too long and complex, there’s a very high probability that they’re saying this because it’s true, at least for them.

If an AI “reviewer” tells me that my sentences are too long and complex, it’s saying it because this is something it’s seen people say in response to critique requests and it’s trying to sound like a human would. Is it actually true, even at the level that a human reviewer’s subjective opinion is true? No way to know.

Beyond that, a lot of it comes down to Barnum statements: https://medium.com/@herbert.roitblat/this-way-to-the-egress-barnum-effect-or-language-understanding-in-gpt-type-models-597c27094f35

Many authors can benefit from generic advice like “consider your target audience”, but we don’t need to waste CPU cycles to give them that.

This term I had a couple of student papers here at the end that would not have benefited from ChatGPT at all. Once a student gets on a roll, you’ll sometimes get sections that go on at length — they’re trying to summarize a concept, and the answer is to keep writing until every possible angle is covered. The role of the editor is to say, “Enough. Cut, cut, cut — try to be more succinct!” I’ve got one term paper that is an ugly mess at 30 pages, but has good content that would make it an “A” paper at 20 pages. ChatGPT doesn’t do that. It can’t do that, because its mission is to generate glurge that mimics other papers, and there’s nobody behind it that understands the content.

Anyway, sometimes social media comes through and you get a bunch of humans writing interesting stuff on both sides of an argument. I’d hate to see how ugly social media could get if AIs were chatting, instead.

Too much social media

Once upon a time there was Twitter, and it was fine. There was much to dislike about it, but it had the advantage of being the one central repository of all the chatter, for good and ill, and I coped with the badness by doing a lot of blocking.

Then it became “X,” and it was terrible and vile, and Elon Musk is a neo-Nazi idiot, so I left, cleanly and completely. That was a good decision on my part. So I started exploring the other social media options.

I got on Mastodon. It’s a bit clunky, and I still don’t understand some of the details, but I’m comfortable there. I like the diversity of content. Sometimes people are too weirdly judgmental, but it’s not my site, so I’ll adjust. It’s still on my recommended list.

I’m also on BlueSky, which is probably the most like the old Twitter. It’s more centralized than Mastodon, with a good ebb and flow of topics, and there’s actually a Science Bluesky. I’m sticking with it longer, and we’ll see how it shapes up.

Then there’s Threads. I don’t know about Threads. It has a very different dynamic — people take the name literally, and there are a lot of threads, where they go on and on over multiple comments, and it’s beginning to bug me. Shouldn’t you just start a blog? People do write a lot, which is a positive. It’s a Zuckerberg production, which is a COLOSSAL NEGATIVE. I killed Facebook long ago, that was enough.

So, anyway, there can be only one, and I’ve decided to axe Threads. That means that in my head it is now a duel to the death between Mastodon and BlueSky.

Who else is on social media? What do you prefer? Don’t bother to tell me to abandon it all, I’m accustomed to my frequent tiny blips of interaction.

At least you quickly know to not bother reading the rest

Here’s a blatant example of AI polluting the scientific literature:

I, too, would like to know “How come this meaningless wording survived proofreading by the coauthors, editors, referees, copy editors, and typesetters?”

OK, typesetters are forgiven, it’s their job to print exactly what they are given, but the others? No sympathy. I think the answer is that there is so much trash poured out on their desks that their eyes glaze over and they end up rubber-stamping everything, because the alternative is madness. They’d have to read this junk.

The AI apocalypse is already here

I’m not alone in seeing how the internet has been degenerating over the years. The first poison was capitalism: once money became the focus of content, a content that was rewarded for volume rather than quality, the flood of noise started to rise. Then it was the “algorithm”, initially a good idea to manage the flow of information that was quickly corrupted to game the rules. SEO became a career where people engineered that flow to their benefit. And Google smiled on it all, because they could profit as well.

The latest evil is AI, which is nothing but a tool to generate profitable noise with which to flood the internet, an internet that is already choking on garbage. Now AI is beginning to eat itself.

Generative AI models are trained by using massive amounts of text scraped from the internet, meaning that the consumer adoption of generative AI has brought a degree of radioactivity to its own dataset. As more internet content is created, either partially or entirely through generative AI, the models themselves will find themselves increasingly inbred, training themselves on content written by their own models which are, on some level, permanently locked in 2023, before the advent of a tool that is specifically intended to replace content created by human beings.

This is a phenomenon that Jathan Sadowski calls “Habsburg AI,” where “a system that is so heavily trained on the outputs of other generative AIs that it becomes an inbred mutant, likely with exaggerated, grotesque features.” In reality, a Habsburg AI will be one that is increasingly more generic and empty, normalized into a slop of anodyne business-speak as its models are trained on increasingly-identical content.

After all, the whole point of AI is to create slop that will be consumed because it looks sorta like the slop that people already consumed. So make more of it! We’re in competition with the machines that are making slop, so we can outcompete them by just making more of it. It’s what biology would look like if there were no natural selection, and if energetic costs were nearly zero — we’d be swimming in a soup of goo. As Amazon has discovered.

Amazon’s Kindle eBook platform has been flooded with AI-generated content that briefly dominated bestseller lists, forcing Amazon to limit authors to publishing three books a day. This hasn’t stopped spammers from publishing awkward rewrites and summaries of other people’s books, and because Amazon’s policies don’t outright ban AI-generated content, ChatGPT has become an inoperable cancer on the body of the publishing industry.

That’s a joke. Limiting authors to three books a day? How about limiting it to one book a month, which is more in line with the human capacity to write? You know that anyone churning out multiple books per day is not investing any thought into them, or doing any real research, or even aspiring to quality. Amazon doesn’t care, they exist only to skim off a few pennies of profit off each submission, so sure, they’ll take every bit of hackwork you can throw at them. Take a look at the Kindle search page sometime — it’s nothing but every publisher’s slush pile amplified ten thousand fold.

The Wall Street Journal reported last year that magazines are now inundated with AI-generated pitches for articles, and renowned sci-fi publisher Clarkesworld was forced to close submissions after receiving an overwhelming amount of AI-generated stories. Help A Reporter Out used to be a way for journalists to find potential sources and quotes, except requests are now met with a deluge of AI-generated spam.

These stories are, of course, all manifestations of a singular problem: that generative artificial intelligence is poison for an internet dependent on algorithms.

The only algorithm I want anymore is “Did PZ Myers subscribe to this creator? Then show the latest from them.” I don’t want “X is vaguely similar to Z that PZ Myers subscribed to” and I sure as hell don’t want “Y paid money to be fed to everyone who liked Z”, but that is what we do get.
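That wished-for algorithm is short enough to write down. This is a minimal sketch in Python, assuming posts carry a creator name and a timestamp. The `Post` type, its field names, and `subscription_feed` are all hypothetical and describe no real platform’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    creator: str
    title: str
    posted_at: int  # hypothetical timestamp; a larger number means a newer post


def subscription_feed(posts, subscriptions):
    """Return only posts from subscribed creators, newest first.

    No similarity matching and no paid placement: a post appears
    if and only if its creator is in the subscription set.
    """
    mine = [p for p in posts if p.creator in subscriptions]
    return sorted(mine, key=lambda p: p.posted_at, reverse=True)


posts = [
    Post("Z", "old post from Z", 1),
    Post("X", "vaguely similar to Z", 2),
    Post("Y", "paid promotion", 3),
    Post("Z", "new post from Z", 4),
]
for post in subscription_feed(posts, {"Z"}):
    print(post.title)
```

With only Z in the subscription set, the feed holds Z’s two posts, newest first, and nothing from X or Y.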

One hope is that all the AI-based companies will eventually start cannibalizing each other. That may have already begun: two AI image companies, Midjourney and Stability AI, are fighting because Stability skulked into the Midjourney database to snatch up as much of their data as they could.

Here’s a prompt for you: two puking dogs eating each other’s sick and vomiting it back up again, over and over.

The era of beautiful airplanes

When I was a young kiddo, up through high school, I had two passions: biology and airplanes. You can guess which one won out, but I still sometimes dream of flying. In those days, I’d bicycle out to one of the local airports — Boeing towns had no shortage of them — and just hang out at the chain link fence by the end of the runway, or bike around the hangars. It was a treat to take a long bike trip to the Museum of Flight, which at the time was a big hangar where people were reconstructing a biplane, but has since expanded into a magnificent complex with all kinds of planes.

I am suddenly reminiscing about this because YouTube randomly served up a video about one of my favorite old-timey airplanes, the P-26 Peashooter.

That great big radial engine, that lovely pre-war color scheme, and it’s wearing pants! Before retractable landing gear became a must-have for any high performance plane, the wheels were outfitted with aerodynamic coverings, which I find irresistibly charming. Planes from the 1930s hit a sweet spot for me, so this random video in which nothing really happens was something I had to watch. It’s an odd trigger that reminds me of being 15 years old again.

So why did I give up my fascination with planes? One factor was that I only learned in high school that I was extremely near-sighted, and needed glasses — that felt like discovering that I was broken, and nature was telling me that certain pathways were closed to me. I was also getting deeper and deeper into that scholarly stuff, reading constantly, which probably contributed to my optical failures. I still sometimes think it would be awesome to take flying lessons, except a) no time, b) no money, and c) age has taught me that there are many things that look easy, but actually require a great deal of skill and discipline to do well. Flying is one of those things that is unforgiving of dilettantes.

But still, those aircraft from the Amelia Earhart era give me a little tingle.

You couldn’t pay me to ride in a Tesla

Let alone buy one. They’re over-engineered and clumsily designed, as we can see in the example of this stupid, pointless death.

Angela Chao, Sen. Mitch McConnell’s billionaire sister-in-law, spent her last minutes alive frantically calling her friends for help as her Tesla slowly sank in a pond on a remote Texas ranch, according to a report.

Chao, the billionaire former CEO of dry bulk shipping giant Foremost Group, tragically died at the age of 50 on Feb. 10 after accidentally backing her car into the pond while making a three-point turn.

When the car lost power, she couldn’t get out while the car filled with water.

The windows are made of laminated glass, which sounds like a plus, but it’s so tough that it isn’t easily broken. The doors are opened electronically, with a clever little button. There are manual switches for the front doors, but they’re not obvious and you need to have read the manual to know about them. The manual switches for the back doors are buried in a very nonintuitive place, and further, owners are warned that using them too much can damage the finish.

Apparently, changing gears is done with an LED touch screen. Why? Multiple generations of Americans have been trained on simple levers and buttons that are familiar and reliable. There is a virtue to simplicity and obvious controls.

Manual controls are probably cheaper, too, but not as flashy.