The Bitter Lesson, Moore’s Law, and a Trade War With China

News: AI in malicious hands

The Weekly Lab Report

I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news.

If you’re new to the Lab Report you can subscribe here.

If you like what you’re reading you’ll love one of our classes. Schedule a training from our catalog or request a custom class consultation.

From The Lab

Today I’ll be finishing up the public-facing outline for my upcoming Python class with DevSprout. Several of you indicated interest, so I’ll share that next week.

This week I taught an Intermediate SQL course that was a ton of fun, but also took most of my time and energy. I’m also scheduled in the classroom for 32 hours next week. As a result, this week’s and next week’s newsletters will both be shortened.

Relatedly, we’re hoping to hire a part-time writer to help avoid this situation in the future. If you or someone you know wants to help me write this newsletter, drop me a line at [email protected].

Today’s Lesson

The Bitter Lesson, Moore’s Law, and a Trade War With China

An underappreciated fact about the machine learning revolution is that it’s fundamentally about computer hardware. The family of models taking the world by storm, neural networks, was first described in a paper in 1943. At the time, neural networks failed to do anything interesting: they were just way too slow.

Nevertheless, since about 2014 it’s been popular to describe the progress in AI as an “exponential trend” that may soon result in hyper-intelligent AIs going full Skynet. And it’s true: progress in AI has been exponential. But it’s not the algorithms that are getting exponentially better, it’s the hardware they run on. Specifically, the number of transistors we can fit on a given area of an integrated circuit has been growing exponentially since the 1960s.

The exponential growth of transistor count since 1971. Source: https://bjc.edc.org/bjc-r/cur/programming/6-computers/2-history-impact/2-moore.html

This observation, first made by Gordon Moore in 1965, became a guiding principle of computer chip development: The number of transistors that we could fit on an integrated circuit would double roughly every two years (a figure often quoted as 18 months). This phenomenon is called Moore’s Law, and the computer chip manufacturing industry has kept it alive for nearly six decades through incredible feats of engineering such as Extreme Ultraviolet (EUV) photolithography.
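To make the doubling arithmetic concrete, here is a minimal sketch. The starting point (about 2,300 transistors in 1971, the Intel 4004) and the function name are my own choices for illustration, and the doubling period is a parameter because the figure gets quoted as anything from 18 months to two years.

```python
def projected_transistors(start_count, start_year, year, doubling_years=2.0):
    """Project a transistor count assuming a fixed doubling period."""
    elapsed_years = year - start_year
    return start_count * 2 ** (elapsed_years / doubling_years)


# Illustrative starting point: roughly 2,300 transistors in 1971.
for period in (1.5, 2.0):
    estimate = projected_transistors(2_300, 1971, 2021, doubling_years=period)
    print(f"doubling every {period} years: about {estimate:.1e} transistors by 2021")
```

Even a small change in the doubling period compounds into orders of magnitude over 50 years, which is why the exact figure people quote matters.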

In 2019, longtime AI researcher Richard Sutton wrote a short paper called “The Bitter Lesson.” Here’s the opening of that paper:

The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore's law, or rather its generalization of continued exponentially falling cost per unit of computation.

Richard Sutton (emphasis mine)

Once upon a time AI researchers approached their work with the assumption that clever algorithms could effectively encode something like human-level “understanding” or “intelligence” in a given domain. But Sutton argues in his paper that time and time again, in domains from chess and Go to speech recognition and computer vision, “brute force” methods have ultimately come out on top.

This has been true for as long as Moore’s Law has held up, but the chip industry is hurtling towards a few physical limitations that will eventually bring the 60-year “law” to an end, and perhaps soon.

The End of Moore’s Law?

Moore himself once commented, “These are fundamentals I don’t see how we [will] ever get around.”

The fundamentals he’s referring to are: heat, size, and the speed of light.

Heat

Until the early 2000s, CPU speeds doubled along with transistor count. Smaller transistors meant electricity was traveling a smaller distance during a single cycle of computation, which allowed computer engineers to increase the clock speed. However, increasing a CPU’s clock speed also makes it hotter. In about 2005, high-end CPUs started running hot enough to melt critical components of the CPU.

As a result, chip manufacturers have all but stopped trying to increase clock speeds (called frequency in the chart below) and started adding more cores instead. Since then, high-performance computing has increasingly been all about parallelism.

Fun software fact: The main reason Transformers have replaced Recurrent Neural Networks (RNNs) as the state of the art in natural language processing is that Transformers are well suited to parallel computing and RNNs are not. Transformers effectively capture the value of these additional cores, while RNNs cannot.
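Here is a toy sketch of why that is. It is not a real Transformer or RNN: the weights are made-up scalars, there are no learned projections, and the function names are hypothetical. It only illustrates the dependency structure: each RNN state needs the previous state, while each attention output reads only the raw inputs and can therefore be computed in any order, or on separate cores.

```python
import math


def rnn_states(inputs, w_in=0.5, w_rec=0.9):
    """Toy scalar RNN: each state needs the previous one, so the loop is sequential."""
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)  # depends on h from the prior step
        states.append(h)
    return states


def attention_output(i, inputs):
    """Toy scalar self-attention for position i: reads only the raw inputs."""
    scores = [inputs[i] * x for x in inputs]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return sum(w / total * x for w, x in zip(weights, inputs))


tokens = [0.1, -0.4, 0.7, 0.2]

# Attention positions can be computed in any order with the same result.
forward = [attention_output(i, tokens) for i in range(len(tokens))]
shuffled = {i: attention_output(i, tokens) for i in (2, 0, 3, 1)}
assert forward == [shuffled[i] for i in range(len(tokens))]

# The RNN has no such freedom: state 3 cannot be computed before state 2.
print(rnn_states(tokens))
```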

This graph shows how various features of high-end computer hardware have changed over time. Source: https://www.researchgate.net/figure/48-Years-of-Microprocessor-Trend-Data-2_fig1_358572677

Size

The current state-of-the-art transistor size is 2 nanometers, roughly the width of 10 carbon atoms. 2 nm transistors have not been commercialized yet, but the Taiwan Semiconductor Manufacturing Company (TSMC) says it will open the first fab producing 2 nm chips by 2025.

2 nm is an incredible feat, but we may not be able to go much further. Assuming the size of a transistor keeps halving, we only have about three generations before transistors are the size of a single carbon atom. A single-atom transistor actually HAS been created, way back in 2012, but it’s not suitable for making computers: it has to be held at negative 196 °C to function. Another was invented in 2020, which has its own commercialization issues.
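The generation count is just a base-2 logarithm. A quick sketch, taking the figure above (2 nm is roughly 10 carbon atoms wide) as the only input; the function name is mine:

```python
import math


def halvings_until(start_nm, target_nm):
    """Number of size halvings needed to shrink from start_nm to target_nm."""
    return math.log2(start_nm / target_nm)


# Assumption from the text: 2 nm is roughly the width of 10 carbon atoms.
carbon_atom_nm = 2 / 10
print(round(halvings_until(2, carbon_atom_nm), 2))  # prints 3.32
```

So after three halvings a transistor would be a little over one atom wide, and a fourth halving would take it below that.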

Any theoretical subatomic transistor would be a member of the quantum realm. I won’t pretend to be an expert in quantum physics, but the gist is that subatomic-sized transistors will be subject to Heisenberg’s uncertainty principle. If that happens we won’t be able to reliably measure the state of a transistor, rendering it useless for computation. If transistors continue to shrink according to Moore’s Law then we’ll hit this quantum limit by 2036.

The Speed of Light

The above factors might not matter if something could travel infinitely fast. Instead of making computers faster by making components smaller, we could just make the information travel faster. We’ll probably get a bit more computational power by switching to light-based computers, which transmit information faster than electricity-based computers. But if the speed of light is indeed a hard limit in this universe, then CPU speeds will be subject to it.
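To get a feel for how tight that limit is, here is a rough sketch of the farthest anything could travel during a single clock cycle. The clock speeds are illustrative values I picked, and real electrical signals move slower than light in a vacuum, so treat these numbers as upper bounds.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter


def max_distance_per_cycle_cm(clock_hz):
    """Farthest any signal could travel during one clock cycle, in centimeters."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100


for ghz in (1, 3, 5):
    distance_cm = max_distance_per_cycle_cm(ghz * 1e9)
    print(f"{ghz} GHz: at most {distance_cm:.1f} cm per cycle")
```

At a few gigahertz the budget is already down to a handful of centimeters per cycle, which is why physical distance inside a machine matters.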

What Does This Have to Do With China?

Basically, the Biden Administration has learned The Bitter Lesson and is applying it in its increasingly adversarial approach to China. Biden views AI as a major national security issue, and his administration is trying to limit China’s ability to advance in that domain. The result is an ongoing trade war centered on computer chip production capabilities.

This particular battle began in earnest in October 2022 when the US Bureau of Industry and Security issued a 139-page document which, among other things, expanded export controls. The controls were specifically targeted to limit China’s ability to acquire or build the high-end computer hardware used to train ML systems. Things like EUV lithography technology and high-end chips made with EUV are among the controlled technologies.

China has retaliated, for example by creating its own export controls on gallium and germanium, which are important raw materials for making computer chips, fiber optics, and related technologies. China also ramped up its production of “legacy” chips, which are less powerful, are fabricated using different processes, and are still very important in computer hardware writ large, but not for state-of-the-art AI research.

Another layer to this story is that the industry leader, TSMC, is located in Taiwan. Tensions between China and Taiwan are high, and Biden has previously indicated that his administration would defend Taiwan from a Chinese invasion if it ever came to that. The TSMC fabs are such an important strategic asset that some war-gamers have suggested that the U.S. would destroy them before allowing China to take control.

We had classes Monday-Thursday. We’re cutting the News Quiz this week because we ran out of time.

Themes in the News

Commodification of Generative Models is Increasing Malicious Uses of AI

From text to audio to images to video, ML models are increasingly able to generate impressive, useful, and realistic content. These models have recently become commodified via open source, pre-trained models, APIs, and other consumer interfaces which dramatically lower the barrier to entry for deploying AI systems.

While there are many useful, cool, and pro-social uses of this technology, we’re witnessing a significant rise in malicious or otherwise unsavory uses, too.

Versions of the GPT architecture trained specifically to create malware, execute phishing attacks, and defraud people have started popping up. In terms of generative cybersecurity there’s enough “there” there for DARPA to get involved: they announced a two-year “AI Cyber Challenge” with nearly $20 million in prize money, and an additional $7 million as startup cash for small businesses who want to participate.

NAO Medical—which runs a series of medical clinics in New York—was caught using large language models to write nonsense articles in order to boost its SEO and rank higher on search engines. It appears they’ve since taken these articles down, but a copy of the article titled “Derek Jeter Herpes Tree: Causes, Symptoms, and Treatment” is retained in the linked Time article.

In a creepy and distasteful (although perhaps not exactly malicious) example, some TikTok creators used generative models to recreate the likenesses of deceased or missing children and have those children tell their stories. Their works include a video where Anne Frank’s likeness first tries to sell you baby clothes, then tells Anne Frank’s actual story.

These so-called “deepfakes” are popping up in an array of unsavory uses. They’re being used to imitate the likenesses of people (mostly women) who wouldn’t otherwise appear in pornography, with major implications for their reputation and privacy.

Audio deepfakes have been used in multiple instances of bank fraud. For example, scammers will convince a bank teller over the phone that they are someone with authority by faking that person’s voice. Then they ask said teller to execute a wire transfer.

In general, generative AI tools are making it much easier and faster to create spammy content, accelerating the creation of all sorts of scams and click farming.

Teb’s Tidbits

Remember…

The Lab Report is free and doesn’t even advertise. Our curricula are open source and published under a public domain license for anyone to use for any purpose. We’re also a very small team with no investors.

Help us keep providing these free services by scheduling one of our world class trainings or requesting a custom class for your team.
