I saw this interesting paper linked on Bluesky:
The reanimation of pseudoscience in machine learning and its ethical repercussions
It is from Patterns, Volume 5, Issue 9 (September 13, 2024), and talks about the harms of ML through its promotion of pseudoscience, or, as the paper states:
The bigger picture
Machine learning has a pseudoscience problem. An abundance of ethical issues arising from the use of machine learning (ML)-based technologies—by now, well documented—is inextricably entwined with the systematic epistemic misuse of these tools. We take a recent resurgence of deep learning-assisted physiognomic research as a case study in the relationship between ML-based pseudoscience and attendant social harms—the standard purview of “AI ethics.” In practice, the epistemic and ethical dimensions of ML misuse often arise from shared underlying reasons and are resolvable by the same pathways. Recent use of ML toward the ends of predicting protected attributes from photographs highlights the need for philosophical, historical, and domain-specific perspectives of particular sciences in the prevention and remediation of misused ML.
The problem, simply put, is that the people responsible for the ML cannot evaluate the data they feed into it. Or, as the paper explains:
When embarking on a project in applied ML, it is not standard practice to read the historical legacy of domain-specific research. For any applied ML project, there exists a field or fields of research devoted to the study of that subject matter, be it on housing markets or human emotions. This ahistoricity contributes to a lack of understanding of the subject matter and of the evolution of methods with which it has been studied. The wealth of both subject-matter expertise and methodological training possessed by trained scientists is typically not known to ML developers and practitioners.
The gatekeeping methods present in scientific disciplines that typically prevent pseudoscientific research practices from getting through are not present for applied ML in either industry or academic research settings. The same lack of domain expertise and subject-matter-specific methodological training characteristic of those undertaking applied ML projects is typically also lacking in corporate oversight mechanisms as well as among reviewers at generalist ML conferences. ML has largely shrugged off the yoke of traditional peer-review mechanisms, opting instead to disseminate research via online archive platforms. ML scholars do not submit their work to refereed academic journals. Research in ML receives visibility and acclaim when it is accepted for presentation at a prestigious conference. However, it is typically shared and cited, and its methods built upon and extended, without first having gone through a peer-review process. This changes the function of refereeing scholarship. The peer-review process that does exist for ML conferences does not exist for the purpose of selecting which work is suitable for public consumption but, rather, as a kind of merit-awarding mechanism. The process awards (the appearance of) novelty and clear quantitative results. Even relative to the modified functional role of refereeing in ML, however, peer-reviewing procedures in the field are widely acknowledged to be ineffective and unprincipled.78,79 Reviewers are often overburdened and ill-equipped to the task. What is more, they are neither trained nor incentivized to review fairly or to prioritize meaningful measures of success and adequacy in the work they are reviewing.
This brings us to the matter of perverse incentives in ML engineering and scholarship. Both ML qua academic field and ML qua software engineering profession possess a culture that pushes to maximize output and quantitative gains at the cost of appropriate training and quality control. In most scientific domains, a student is not standardly expected to publish until the PhD, at which point they have typically had at least half a decade of training in the field. Within ML, it is now typical for students to have their names on several papers upon exiting their undergraduate. The incentives force scholars and scholars in training to churn out ever higher quantities of research. As limited biological agents, however, there is a bottleneck on time and critical thought that can be devoted to research. As quantity of output is pushed ever higher, the quality of scholarship necessarily degrades.
The field of ML has a culture of obsession with quantification—a kind of “measurement mania.” Determinations of success or failure at every stage and level are made quantitatively. Quantitative measures are intrinsically limited in how informative they can be—they are, as we have said, only informative to the extent that they are lent content by a theory or narrative. Quantitative measures cannot, for instance, capture the relative soundness of problem formulation. It has been widely acknowledged that benchmarking is given undue import in the field of ML and, in many cases, is actively harmful in that it penalizes careful theorizing while rewarding kludgy or hardware-based solutions.78
A further contributing factor is the increased distribution of labor within scientific and science-adjacent activities. The Taylorization or industrialization of science and engineering pushes its practitioners into increasingly specialized roles whose operations are increasingly opaque to one another. This fact is not intrinsically negative—its repercussions for the legitimacy of science can be, when care is taken, a net positive. In combination with the other facets already mentioned, however, it can cause a host of problems. Increasingly, scholars and industry actors outsource the collection and labeling of their data to third parties. When—as we have argued—much of the theoretical commitments of a modeling exercise come in at the level of data collection and labeling, offloading these tasks can have damaging repercussions for the epistemic integrity of research.
All of the above realities work alongside a basic fact of modern ML: its ease of use. With data in hand and the computing power necessary to train a model, it is possible to achieve publishable or actionable results with a few hours of scripting and write-up. The rapidity with which such models are able to be trained and deployed works alongside a lack of gatekeeping and critical oversight to ill effect.
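That last point about ease of use is worth making concrete. Here is a minimal sketch in Python (using scikit-learn and an entirely synthetic dataset, so the names and numbers are illustrative and not from the paper): a handful of lines takes you from "data in hand" to a reportable accuracy number, and nothing in the workflow forces you to ask where the data came from, what the labels actually measure, or whether the problem formulation makes sense.
```python
# Illustrative sketch only: synthetic data stands in for whatever dataset a
# practitioner happens to have "in hand." Nothing below checks whether the
# labels are meaningful or the problem formulation is sound.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# In practice this would often be data collected or labeled by a third party,
# with provenance the modeler never examines.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# One quantitative score, ready for a write-up or a product decision.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```
The script is the easy part; the domain knowledge the paper is asking for lives entirely outside it.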
In my opinion, the paper makes the case for a new process where people who actually know the field are part of vetting the data given to the ML model.
… where people who actually knows the field are part of wetting the data given to the ML model.
I don’t think this sort of vetting will ever be a common practice in ML.
Oops – didn’t realize that I had made a spelling mistake – changed wetting to vetting.
And unfortunately, I think you are right. We have known about issues with data quality in ML models for the longest time, but it doesn’t seem like we are progressing.
This is an age-old problem. GIGO.
This is why good scientific journals have reviewers. Unfortunately, with the huge number of papers being churned out, and the huge number of journals, reviews are often skimpy; frequently the reviewers aren’t all that good, or they simply okay stuff without really reading the articles.
It crops up in other places. There are systems that predict how likely someone is to have committed a crime based on circumstantial data, but since African-Americans are frequently arrested largely because of their race, those systems will frequently rate African-Americans as more suspect simply based on their race.
And people who get most of their information about the world from social media ….