Tag Archives: technology

Context Engineering

Andrej Karpathy posted an interesting question on X recently, asking how people used LLM chat bots. Specifically he was talking about making new chats for every question/purpose, vs. the One Thread approach. He discussed some trade-offs in both performance and fidelity when taking advantage of the large context windows now available to us. It’s really interesting and I recommend you read the whole post. But I was surprised to find that no one else who answered him described the method I use, and so I wrote it up, and it garnered a lot of interest, so I thought I’d flesh it out a bit further here:

I manage hundreds (thousands?) of conversations that fall into four groups:

  1. long-running, bookmarked – basically my staff
    3 examples:
    • I have an AI personal trainer/nutritionist I always return to for training/nutrition questions.
    • I have one conversation that helped me build my current home Linux box, and I return to it for any HW/OS/SW questions related to it.
    • I have several AI professors I use to learn various subjects – one per subject
  2. useful, may return to, but not necessarily
    examples:
    • I saw nice sweet potatoes at the grocery store – asked about sweet potato soup, made soup. A week later, I saw a nice pumpkin – wanted to make a similar soup. I remembered that convo, which already knew my equipment and preferences, and returned to it for a different soup.
    • in general, if I think I’ve asked a question before, and the context from before will save me some time now, I use search to look at previous conversations, and might continue one of them rather than start a new one
  3. One-off questions: I usually ask them in a fresh conversation
  4. Truly throwaway questions. Not only do I start a fresh conversation, but I will usually archive/delete it when I’m done. This is when the subject is pretty trivial, and I view it as clutter.

Special case for some long-running conversations: I have also noticed that sometimes overly long context can start to produce weird effects (and Andrej describes a bit of why this happens). The LLM starts to hallucinate more, is less reliable about remembering details, and so on. In situations like this I sometimes ask it to generate a detailed summary of everything we have been working on, and I may ask follow-up questions. Then I paste the results into a new conversation and continue from there, basically having transplanted the essentials from the older chat to the fresh one.

It is true that, by this point, the LLM is already getting a bit squirrelly, so if the summary is missing anything important, I remind the LLM, and that’s enough to help it remember, and then it adds a summary of that too. It’s not that it forgets, it just has lost the thread on what is most relevant. But I can help, and so I repeat until I’m satisfied and only then do I feed it to the new chat. It’s not perfect, but I’m also not deleting the old chat, so I can go back to it if necessary.
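The transplant step can be sketched in a few lines of Python. This is only an illustration of the idea, not a tool I actually run: the function name `build_handoff_prompt` and the wording of the prompt are made up for this example, and it calls no chat API at all. It just assembles the text you would paste into the fresh conversation from the summary plus whatever you had to remind the old chat about.

```python
def build_handoff_prompt(summary: str, reminders: list[str]) -> str:
    """Assemble the text to paste into a fresh chat (hypothetical helper).

    summary   -- the detailed summary the old conversation produced
    reminders -- details the summary missed and you had to add back
    """
    parts = [
        "This conversation continues an earlier, much longer one.",
        "Summary of the earlier conversation:",
        summary.strip(),
    ]
    # Drop blank reminders so the prompt has no empty bullets.
    extra = [r.strip() for r in reminders if r.strip()]
    if extra:
        parts.append("Details the summary missed:")
        parts.extend(f"- {r}" for r in extra)
    parts.append("Please confirm you have this context before we continue.")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_handoff_prompt(
        "We built a home Linux box and tuned its storage setup.",
        ["The backup drive is external."],
    ))
```

Keeping the reminders in their own list mirrors the loop described above: each time the summary misses something, you add one more item and rebuild, and the old chat stays untouched as a fallback.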

One of the followup questions I got was about how I go about organizing and finding the conversations I want in the midst of the thousands I have. That works like this:

Both Grok & ChatGPT make it relatively easy. You can rename chats, so for my “staff,” that is, my long-running chats-with-a-purpose that I return to, I rename them like this:

** Personal Trainer **
** Nuclear Engineering Professor **
** Productivity Coach **

which makes them easy to pick out of the list.

Grok even lets you bookmark specific chats. And, if one of my AI staff has scrolled way down because I haven’t talked to it in a while, it will pop right up when I search for it because it has a good and memorable name.

I used to create browser bookmarks too, because each chat has its own URL, but I find that’s not needed and so I stopped.

As for the remaining conversations, I’m less worried about needing to find them. Like in my soup recipe example, I just search for “soup” and 13 conversations pop up, and I reopen the one that is the most relevant to what I want to do now.
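For the curious, the naming-plus-search habit boils down to something like the sketch below. It is purely illustrative: the titles are invented, `is_staff` and `search_titles` are hypothetical helpers, and neither Grok nor ChatGPT exposes chats to me this way (I just use their built-in search boxes).

```python
def is_staff(title: str) -> bool:
    """True for titles wrapped in asterisks, e.g. '** Personal Trainer **'."""
    t = title.strip()
    return len(t) > 4 and t.startswith("**") and t.endswith("**")


def search_titles(titles: list[str], keyword: str) -> list[str]:
    """Case-insensitive substring match over conversation titles."""
    k = keyword.casefold()
    return [t for t in titles if k in t.casefold()]


if __name__ == "__main__":
    # Invented example titles, for illustration only.
    titles = [
        "** Personal Trainer **",
        "Sweet potato soup",
        "Pumpkin Soup follow-up",
        "Router firmware question",
    ]
    print([t for t in titles if is_staff(t)])
    print(search_titles(titles, "soup"))
```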

Hope you found this helpful. If you did, or if you have a better method, let me know on X (I don’t ever look at the comments here). I’d love to hear it.

History Was Made

On October 13th, 2024, humanity took a giant leap toward the future of space exploration. On that Sunday, I joined countless others in watching the fifth integrated test flight of Starship—IFT-5. During the flight (and immediately after), my part of Twitter/X exploded with awe-inspiring images and reactions. But the next day at work, few of my colleagues seemed to grasp the magnitude of what had happened. Some space enthusiasts shared my excitement, but most were only dimly aware that anything significant had taken place.

If you are one of those, you can be forgiven. Most people aren’t aware of the significance of the event, but, believe me, it was historic. Humanity crossed a very important line on October 13th, 2024. It was the moment we transitioned from asking if full reusability was possible to knowing it is.

To explain:

SpaceX has a launch tower called Mechazilla, with huge chopstick arms that lift the giant rocket, called the Super Heavy booster, onto the launch mount, and then lift the giant second stage, called Starship, onto the Super Heavy. Super Heavy is the most powerful rocket ever made, and the combined stack is not only the largest rocket ever built, it’s the largest flying object ever to exist.

On Sunday, SpaceX launched this monster for the fifth time. Super Heavy lifted the whole stack off the launch mount (SpaceX doesn’t use a pad for Starship). The crowds watching cheered with delight to see the incredible column of purple-white fire from the combined might of 33 Raptor 2 rocket engines. Mach diamonds the size of tour buses. Shock waves visibly rippling outward—unreal. Of course to onlookers it’s totally silent until suddenly it’s not, with the rocket already climbing into the sky before the deep crackle of thousands of tons of supersonic superheated gases smacking into still air reaches them. Within a minute of liftoff, the rocket was already traveling faster than the speed of sound. In a tradition we love because of the incredible views, SpaceX has a number of cameras all over both Ship and Booster, which stream HD video live over Starlink, and, with a dawn launch, we were treated to seeing the exhaust plume turn blue-green as the atmosphere thinned out.

At about that moment, having lifted Starship well above the stratosphere, the Super Heavy shut down all but three engines as Ship ignited all six of its engines to pull away from the booster. Super Heavy immediately flipped around, reigniting 13 of its engines to slow its forward momentum. After less than a minute, it had reversed direction and put itself in a high arc to return to Starbase, while Starship continued its climb to space, accelerating downrange to reach a speed within a hair’s breadth of orbital velocity. (This is intentional, so that it will for sure re-enter at the planned location and does not depend on a de-orbit burn.)

For a moment, all is peaceful. Both vehicles are so high there is no atmosphere to give a sense of the tremendous velocities at play. But Super Heavy is not in orbit, and is falling like the gigantic stainless steel cylinder that it is. Unlike Falcon 9 boosters, it skips a second burn to ease re-entry, coming in hot instead. Literally so. The tracking cameras pick it up dropping back down from space at three times the speed of sound, heading back to the launch tower. The bottom glowing red hot as it sheds speed punching through the thickening atmosphere. A kilometer in the air, still moving over 1000 km/h, it again relit 13 of its engines to slow to merely 200 km/h. The incoming shock wave is enormous and visible in the clouds and as it slams into the cryo-mist at Starbase. Multiple sonic booms are felt as much as they are heard.

Then, using only 3 engines, this 25-story beast slowed still further and brought itself gently into Mechazilla’s loving embrace. Mechazilla in turn closed its chopstick arms and caught the rocket in midair.

It was at this point that the world’s space-loving fandom lost its collective mind.

We knew what we had just seen. The pre-dawn glimmer of a second dawn of the space age.

There’s still a long way to go, but this was the last thing we needed to demonstrate in order to know that full re-usability is doable. To see it succeed, on the first attempt, was an indication that the rest of the remaining hurdles are just n iterations away from a solution, and that SpaceX is phenomenally good at iterating. Everything they build, they build more of it than anyone else – more rockets, more rocket engines, more satellites, more antennas than any other company, and in less time.

And the mission wasn’t over. Starship blazed through re-entry with minor damage and aced its landing, right on target, in full view of the camera buoys SpaceX had placed ahead of time. But, though we lost it again at the glorious views of the flip maneuver, we kind of knew that would happen. That was old news. The third one made it to re-entry, the fourth one made it to a lovely soft landing on a flap and a prayer and 6 km off target, so we knew the fifth one was going to do better. Note: SpaceX intends to push some limits on Starship for flight 6, so we’ll see, but on the other hand, the flight profile will have it come down when it is daylight on the Indian Ocean.

Hours later (none of us were calm yet—we were all still freaking out at each other), in a move I didn’t see coming at all, Mechazilla lowered Booster down onto the launch mount and reconnected it to the fueling lines. The calm precision of this operation was almost surreal, considering the booster had just done the nearly impossible and was still bruised and battered from its extreme experience. What’s more, people didn’t even know you could remount a booster without the launch pins to line things up. Clearly, they are serious about working on rapid re-use.

This was SpaceX’s Apollo 4 moment, when the Saturn V first proved it could leave Earth under the power of the most powerful rocket ever built. This flight crossed a threshold where the greatest unknown—whether full reusability could truly be achieved—became a known. From here, it’s hard work and iteration, but starting Sunday, we all saw humanity take a huge step toward becoming a spacefaring civilization.

What would you say ya do here?

I volunteered for the career day for the 11th and 12th graders at my son’s school and they asked me to explain what I do:

More specifically, they asked me to “please provide a brief job description and list the most important aspects of your current job. This will help our students understand what you do on a daily basis.”

Wanting to be completely honest with these kids who are about to try to pick a school, pick a major, figure out a career, I sent them this:

 

I lead a plucky team of data scientists, engineers, and analysts in finding undeclared nuclear R&D around the world. We built/bought/integrated the software, begged/borrowed/took the data, fused it together into something that can swing a billion data records at our most difficult questions, and trained people in how to wield the tools we’d built.

Disciplines involved:
– large-scale data analytics
– information modeling
– programming
– machine learning
– information visualization
– persuasion
– patience
– impatience
– not knowing when to quit

What would you say?

Code.org

My story?

My dad bought a TRS-80 Color Computer when I was about 5. I didn’t learn to code, but I saw a modem, heard binary being played on our cassette drive, and learned what a kilobyte is.

Later, I learned Logo and BASIC when I was 8 and 9. Just very simple toy programs. I learned more sophisticated programming in Pascal in high school. I did have books, and Dad got me started, but my schools’ programming classes get at least half the credit.

I started getting paid to work with computers while still in high school. I have made money ever since from working with computers. Even the years I taught ballroom dancing full-time, I wrote software part-time and brought in new revenue at the studio by setting up the website and our first online sales of gift certificates.

Today I live in Vienna and manage a significant software project at the International Atomic Energy Agency. As a job, it’s amazing, and the work is important. I’m writing this from a lovely apartment in Venice where I’m vacationing with my family while the team works without me.

It’s a good life, and I’m incredibly grateful. I wouldn’t be here if I didn’t know how to code.

But it’s not just about the good work you can do and the good life you can have. It’s fun. The things we can do now with software are amazing. A programmer in the 80s would be awed by what’s possible to coders now. It’s not just faster computers, it’s the fact that so much of the world is now online. Take something simple like flight bookings: they were computerized in the 80s (probably earlier), but in closed systems. Today, there are so many ways to tie that information together that travel booking sites abound, and the best ones are so good that we can be near-omniscient about our options. We think little of booking, from our couch, vacations with airlines and hotels we’ve never heard of.

Coders regularly produce apps which do things that weren’t possible a few years ago. My phone (an anachronistic name for the hyper-connected supercomputer I carry in my pocket) can augment my reality in countless ways, but the latest is holding it up and looking through it so that all the Italian writing is replaced with English.

What’s next? Imagine writing code to do this:

  • social apps that allow you to point your finger and write in the sky…where all your friends can see it through their glasses or contact lenses.
  • designing toys and selling them online where buyers click to print them out on their 3D printers
  • building the apps to do the designing I just mentioned or building the site to broker the transactions
  • writing code to control swarms of tiny flying/crawling robots to…well, frankly the first of these will all have military or intelligence applications which may appeal to some, but, after that, there will be plenty of environmental and scientific uses.

We Live In the Future

[Image: Dan dressed as Neo from The Matrix]

I’m on a train from Washington, DC to New York, currently passing through Philadelphia. We’ll be at New York’s Penn Station in 90 minutes. I just looked up from the book I’m reading on my iPhone and saw a building with a sign on it: Penn Proton Therapy Center. Now I’m writing a blog entry on my iPhone. I don’t feel like spending a couple of dollars on 3G access (I live in Europe, so I’m roaming here) and WiFi hasn’t been installed on this train yet, so I’m writing this in the Notes app instead of directly to my blog, which is hosted in a data center in…er…I have no idea.

Stop and read that again. Only, this time, pretend you are the average human. Remember that the average human does not have access to the Internet and can’t get to this blog. In fact, the average human lacks running water.
