Author Archives: admin

History Was Made

On October 13th, 2024, humanity took a giant leap toward the future of space exploration. On that Sunday, I joined countless others in watching the fifth integrated test flight of Starship—IFT-5. During the flight (and immediately after), my part of Twitter/X exploded with awe-inspiring images and reactions. But the next day at work, few of my colleagues seemed to grasp the magnitude of what had happened. Some space enthusiasts shared my excitement, but most were only dimly aware that anything significant had taken place.

If you are one of those, you can be forgiven. Most people aren’t aware of the significance of the event, but, believe me, it was historic. Humanity crossed a very important line on October 13th, 2024. It was the moment we transitioned from asking if full reusability was possible to knowing it is.

To explain:

SpaceX has a launch tower called Mechazilla, with huge chopstick arms that lift the giant first stage, called the Super Heavy booster, onto the launch mount, and then lift the giant second stage, called Starship, onto the Super Heavy. Super Heavy is the most powerful rocket ever made, and the combined stack is not only the largest rocket ever built, it’s the heaviest flying object ever to exist.

On Sunday, SpaceX launched this monster for the fifth time. Super Heavy lifted the whole stack off the launch mount (SpaceX doesn’t use a pad for Starship). The crowds watching cheered with delight to see the incredible column of purple-white fire from the combined might of 33 Raptor 2 rocket engines. Mach diamonds the size of tour buses. Shock waves visibly rippling outward—unreal. Of course to onlookers it’s totally silent until suddenly it’s not, with the rocket already climbing into the sky before the deep crackle of thousands of tons of supersonic superheated gases smacking into still air reaches them. Within a minute of liftoff, the rocket was already traveling faster than the speed of sound. In a tradition we love because of the incredible views, SpaceX has a number of cameras all over both Ship and Booster, which stream HD video live over Starlink, and, with a dawn launch, we were treated to seeing the exhaust plume turn blue-green as the atmosphere thinned out.

At about that moment, having lifted Starship well above the stratosphere, the Super Heavy shut down all but three engines as Ship ignited all six of its engines to pull away from the booster. Super Heavy immediately flipped around, reigniting 13 of its engines to slow its forward momentum. After less than a minute, it had reversed direction and put itself in a high arc to return to Starbase, while Starship continued the rest of its climb to space and accelerated downrange to reach a speed within a hair’s breadth of orbital velocity. (This is intentional, so that it will for sure re-enter at the planned location, and is not dependent on a de-orbit burn.)

For a moment, all is peaceful. Both vehicles are so high there is no atmosphere to give a sense of the tremendous velocities at play. But Super Heavy is not in orbit, and is falling like the gigantic stainless steel cylinder that it is. Unlike Falcon 9 boosters, it skips a second burn to ease re-entry, coming in hot instead. Literally so. The tracking cameras pick it up dropping back down from space at three times the speed of sound, heading back to the launch tower. The bottom glowing red hot as it sheds speed punching through the thickening atmosphere. A kilometer in the air, still moving over 1000 km/h, it again relit 13 of its engines to slow to merely 200 km/h. The incoming shock wave is enormous and visible in the clouds and as it slams into the cryo-mist at Starbase. Multiple sonic booms are felt as much as they are heard.

Then, using only 3 engines, this 25-story beast slowed still further and brought itself gently into Mechazilla’s loving embrace. Mechazilla in turn closed its chopstick arms and caught the rocket in midair.

It was at this point that the world’s space-loving fandom lost its collective mind.

We knew what we had just seen. The pre-dawn glimmer of a second dawn of the space age.

There’s still a long way to go, but this was the last thing we needed to demonstrate in order to know that full re-usability is doable. To see it succeed, on the first attempt, was an indication that the rest of the remaining hurdles are just n iterations away from a solution, and that SpaceX is phenomenally good at iterating. Everything they build, they build more of it than anyone else – more rockets, more rocket engines, more satellites, more antennas than any other company, and in less time.

And the mission wasn’t over. Starship blazed through re-entry with minor damage and aced its landing, right on target, in full view of the camera buoys SpaceX had placed ahead of time. But, though we lost it again at the glorious views of the flip maneuver, we kind of knew that would happen. That was old news. The third one made it to re-entry, the fourth one made it to a lovely soft landing on a flap and a prayer and 6 km off target, so we knew the fifth one was going to do better. Note: SpaceX intends to push some limits on Starship for flight 6, so we’ll see, but on the other hand, the flight profile will have it come down when it is daylight on the Indian Ocean.

Hours later (none of us were calm yet—we were all still freaking out at each other), in a move I didn’t see coming at all, Mechazilla lowered Booster down onto the launch mount and reconnected it to the fueling lines. The calm precision of this operation was almost surreal, considering the booster had just done the nearly impossible and was still bruised and battered from its extreme experience. What’s more, people didn’t even know you could remount a booster without the launch pins to line things up. Clearly, they are serious about working on rapid re-use.

This was SpaceX’s Apollo 4 moment, when the Saturn V first proved it could leave Earth under the power of the most powerful rocket ever built. This flight crossed a threshold where the greatest unknown—whether full reusability could truly be achieved—became a known. From here, it’s hard work and iteration, but starting Sunday, we all saw humanity take a huge step toward becoming a spacefaring civilization.

On the meaning of acceptance

All the greatest wisdoms come down to acceptance. It is key.
Accept:
… the present moment.
… your situation.
… how you feel.
… each other.

And, yet, our language somehow frames this as something passive.
I’ve written before, “accept, then act,” and this is right, but let’s look at what we are accepting, and how we are acting.

I accept that our universe is full of limitless resources. Energy, material, and time. Even the material contains more energy. Every single atom is full of immense energy, which can be released and directed. I accept that I have intelligence and knowledge, and access to enormous amounts more of both. I accept that I have the love and support of some of the smartest, kindest, and wisest people on the planet. I accept that I have a body which is strong and healthy and genetically blessed.

I will act accordingly.

Bing Chat has an attitude!

Bing Chat is not above ending the chat when it paints itself into a corner.

If you don’t know the scene, it’s a classic: https://youtu.be/5oWNQSCPWy4

In an effort to get access to the new GPT-4 features, I’ve started using the Bing iOS app. And, while it’s cool in some ways, I have found that wrapping GPT-4 in Bing Chat often feels like they’ve wrapped Leonardo da Vinci in Chandler Bing.

My adventures started yesterday evening, in a way that has nothing to do with the GPT-4 features, but seems oddly relevant to what happened today, so I’ll describe it briefly:

I am an American expat living in Vienna. So my Bing Chat has my location as Vienna, my region as Austria, and my language as English. But all its notifications about breaking and local news, weather, whatever, are in German. It’s irritating because my German is not great, so I sent feedback on this through the app.

Adventures in font identification

Flash forward to this morning. Having another tech issue that has nothing to do with GPT-4, I had decided to write a post about the problems I’m experiencing with iOS 17. As part of this, I wanted to make a parody image of a translation app, changing the text of a button. I decided to upload a picture of the button and ask GPT-4 if it could identify the font. OpenAI hasn’t rolled multi-modal input out to my ChatGPT Plus account yet, so Bing Chat was my only option. I gave it the image, and asked it “What font is this?”

It searches for “German button font,” which is not what I wanted, but it’s an understandable action from the AI. And, though the response does seem to include some kind of interpretation of the style of font, it is also clear that the search it performed also informed its answer, because it mentioned fonts used in central Europe, and informed me as to where I could download Old German fonts. So, as I am accustomed to having fruitful interactions with ChatGPT when I explain how its response differs from what I’m looking for, I explained: “Don’t search for German fonts, the word in the image is irrelevant, I just want to know what font is used by the way it looks.”

Denial isn’t just a river in Egypt

Bing Chat straight up denied that it searched for German fonts, and said it only used the style of the text to identify it. I respond, “Yes, you did. You searched for ‘German button font,’ and though your response might have included information based on the style of the text, it also mentioned downloading Old German fonts. I just want you to identify the font by how it looks, unbiased from the search for the word in the image.”

In response, Chandler Bing Chat pretended the connection was poor and ended the chat. I’m serious, I’m not joking: it said I had a network connection problem, and that I would have to restart the chat.

Second attempt

Now, at this point, I actually thought the network connection issue was real. But, in a fit of anthropomorphic pique, I started the new chat like this:

This time it didn’t say anything about a network connection, but it did end the chat immediately. Also, what the heck is with responding in German!? It had never done that before! I’d used it a number of times over the last several days, and Bing Chat never wrote to me in German – the German notifications I was complaining about are part of the Bing iOS app’s other features, not the chat. But now you see why I mentioned them in the beginning.

Hm, ok, well, fine, it’s offended now. I took the screenshots you see above (I wish I’d done so for the previous chat, but I’ll show you a recreation at the end of this article), and created a new chat.

Third time’s the…nope

This time I wanted to get back to my original question, and wanted to prevent the problem from before, so I uploaded the image, and asked it to identify the font in the image, and explicitly told it that the text in the image is irrelevant.

It said it was analyzing the image, and also searched for “font identification tool.” This time its response had nothing to do with the image – it just gave me a list of font identification tools, and then set the title of the chat to “Identificación de fuentes.” Where the heck did the Spanish come from?!

I’ve flustered it so much it is speaking in tongues!

Bing gets an attitude

Ok, at this point I decide I have to document this nuttiness, and I start this article. Because I missed out on the screenshot of the first chat, I try to recreate it as closely as I can – the results are hilarious:

My analysis is that Microsoft, stung by bad press from when Bing Chat professed love for a tech journalist and asked him to end his marriage, has put such strong guardrails in place that the current version is overcautious.

When AIs compare notes

It also doesn’t seem to have incorporated behind-the-scenes instructions to the chatbot to tell it how it is working. Not sure what I’m talking about? This is something I also discovered recently. In the ChatGPT Plus web interface, I have access to DALL-E 3, and it looks like this:

Now, my iOS ChatGPT app doesn’t have access to DALL-E 3 yet, but it is still possible to open the same chat via the history, and it looks like this:

Notice that’s a response from DALL-E 3 to ChatGPT. It’s not intended to be read by me, which is why it wasn’t visible to me when I first did it. But it is clearly explaining something to ChatGPT so that it doesn’t act oddly to me. And, just as clearly, Bing Chat doesn’t have something like that, and so it is left to its own devices when the app does something without the chatbot knowing.

At this point I’m worried I’m going to get my account suspended for daring to argue with the AI, but, well, this is the world now – when we snarl at our computers because they’re not behaving the way we want them to, they can argue back and then become passive-aggressive.

Why Clearing Your Phone’s Cache Might Not Be the Speed Boost You’re Looking For


It’s 2023, and if I see one more “Clear your cache for a faster browsing experience!” article, I might just throw my router out of the window. Seriously, the persistence of this piece of advice is baffling to me. And if we’re being honest, this advice isn’t just antiquated—it’s misleading.

Caches: A Brief Refresher

Browser caches store website data so that when you revisit a site, it can load faster by pulling locally stored data rather than redownloading everything from the server. This mechanism isn’t a new concept; it’s foundational to how web browsers have functioned for decades.
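If it helps to see that mechanism in miniature, here is a toy Python sketch. The names (`TinyCache`, `slow_fetch`) are made up for this post, and the “server” is just a function, so treat it as an illustration of the idea rather than how any real browser is built.

```python
class TinyCache:
    """Toy stand-in for a browser cache: URL -> previously fetched body."""

    def __init__(self, fetch):
        self._fetch = fetch        # function that "downloads" a URL
        self._store = {}           # locally stored copies
        self.network_trips = 0     # how often we had to go to the server

    def get(self, url):
        if url not in self._store:             # cache miss: download, keep a copy
            self.network_trips += 1
            self._store[url] = self._fetch(url)
        return self._store[url]                # cache hit: served locally

    def clear(self):
        self._store.clear()                    # the "clear your cache" button


def slow_fetch(url):
    # Hypothetical server call; a real one would cost a network round trip.
    return f"<html>content of {url}</html>"


cache = TinyCache(slow_fetch)
cache.get("https://example.com/")    # first visit: goes to the network
cache.get("https://example.com/")    # revisit: served from the cache
trips_before_clear = cache.network_trips
cache.clear()
cache.get("https://example.com/")    # after clearing: back to the network
trips_after_clear = cache.network_trips
```

In this toy, the only thing clearing the cache accomplishes is forcing one more trip to the server.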

The Myth of the Slow Cache

Enter the narrative that a “mature” cache (one that’s been accumulating data for a while) will slow down your browsing. Picture caches as attics full of junk, needing occasional spring cleaning.

But let’s get technical for a moment:

  1. Cache Lookup Time: While this is generally faster than fetching from the network, a disorganized cache could, theoretically, introduce delays.
  2. Stale Cache Data: Old and outdated data might break website functionality if not properly validated. But, a modern browser’s validation system ensures that this is rarely a problem.
  3. Cache Pollution: Rarely visited website data taking up cache space? Maybe, but today’s browsers are designed to manage and prioritize cache effectively.
  4. Mismatched Cached Resources: While development changes can lead to outdated cache versions, browsers today are optimized to handle such mismatches with grace.

The perennially posted cache and cookie-clearing articles, if they bother to actually provide any technical basis, might mention these as potential pitfalls as justification, but they are also precisely what browser developers have spent years refining. Modern browsers are optimized to handle these scenarios efficiently, ensuring users receive the fastest and most reliable browsing experience.
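To make the “stale data” point concrete, here is a rough sketch of one validation mechanism HTTP provides: the server labels each version of a resource with an ETag, the browser sends that label back in an If-None-Match header, and the server answers 304 Not Modified if nothing changed. `FakeServer` and `ValidatingCache` are hypothetical names for this example. Real browsers also use freshness lifetimes and can often skip the round trip entirely, which this sketch leaves out.

```python
class FakeServer:
    """Hypothetical origin server that tags each page version with an ETag."""

    def __init__(self):
        self.version = 1
        self.full_downloads = 0

    def request(self, url, if_none_match=None):
        etag = f'"v{self.version}"'
        if if_none_match == etag:
            return 304, etag, None             # Not Modified: no body is sent
        self.full_downloads += 1
        return 200, etag, f"{url} at version {self.version}"


class ValidatingCache:
    """Toy cache that revalidates its copy before reusing it."""

    def __init__(self, server):
        self.server = server
        self.entries = {}                      # url -> (etag, body)

    def get(self, url):
        etag, body = self.entries.get(url, (None, None))
        status, new_etag, new_body = self.server.request(url, if_none_match=etag)
        if status == 304:
            return body                        # cached copy confirmed current
        self.entries[url] = (new_etag, new_body)
        return new_body


server = FakeServer()
cache = ValidatingCache(server)
first = cache.get("/style.css")     # full download
second = cache.get("/style.css")    # revalidated, cached body reused
server.version = 2                  # the site ships a new version
third = cache.get("/style.css")     # mismatch detected, fresh copy fetched
```

The old copy is replaced the moment the server says it is outdated, with no manual clearing involved.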

Seriously, I’m picturing a browser development team lead scheduling an urgent meeting, “Everybody! I just read an article on CNET, and we have a big problem with caching we have to fix! Thank goodness the article keeps getting posted, or I might have missed it!”

Should We Automate Cache Cleaning?

If clearing the cache were such a panacea, why wouldn’t operating systems just automate the process? An automated monthly clear-out, maybe? But they don’t, do they? Maybe because doing so doesn’t lead to the performance boosts that some claim?

If It Ain’t Broke…

Advising users to routinely clear their cache is like suggesting they defragment their SSDs—a relic from a bygone tech era.

We are in an age in which mobile developers are deploying sophisticated optimizations like intelligent charging to prolong battery life, pre-loading web pages the user is likely to click on, and the like. Are we still to imagine developers would pour resources into such innovations yet neglect fundamental features? To argue that there’s some endemic issue with cache implementation is borderline absurd.

Final Thoughts

Before jumping on the “clear your cache” bandwagon, let’s discern fact from fiction. A cache is there to speed up browsing. Emptying it, especially for an average user, is counterproductive. It’s 2023, and we ought to know better than to repost dated, debunked tips as fresh tech advice.

So, the next time you see that all-too-familiar recommendation, or someone advises you to clear your cache for that “extra speed”, point them to this article. Or maybe I’m wrong, and no one is posting those articles anymore, it’s just that I have an outdated cache and I get decades-old content when I visit CNET. 😅

Wedding Dance

You’re getting married. You’ve set a date. You are figuring out everything there is to plan for a wedding. There are binders. Plural. “Centerpieces” is now a word you’ve thought about more than once. You are looking into caterers, dresses, venues, photographers, and you know why you don’t throw rice (by the way that turns out to be an urban legend), and you are struggling to remain on speaking terms with your family without inviting 73252 people to your wedding.

And you’re going to hire a DJ or band. And they’re going to play a song. Many songs actually, but, for that first one, you’ll be alone with your new spouse in the center of what now feels like a gigantic dance floor and it does seem like 73252 people are watching.

Some couples thrive on this. Some couples have learned the entire song and dance from Dirty Dancing. If that’s you, have fun! I hope it’s a hit.

If you’re still reading, then I have a suggestion. Take a few lessons. It will make all the difference in the world. No one expects you to turn suddenly into Fred Astaire and Ginger Rogers, but you can show them you know what you’re doing. You can enjoy your first dance. You can do more than clutch each other and sway to the music. And, for the rest of the night (and your life together) you can dance together, not just in front of each other.

I have taught hundreds of couples to dance, and I specialize in ensuring wedding couples like you really enjoy that wonderful celebration by turning a stressful moment into a fun and romantic one.

Contact me today to schedule a free introductory lesson.

I guarantee a standing ovation.

What would you say ya do here?

I volunteered for the career day for the 11th and 12th graders at my son’s school and they asked me to explain what I do:

More specifically, they asked me to “please provide a brief job description and list the most important aspects of your current job. This will help our students understand what you do on a daily basis.”

Wanting to be completely honest with these kids who are about to try to pick a school, pick a major, figure out a career, I sent them this:

 

I lead a plucky team of data scientists, engineers, and analysts in finding undeclared nuclear R&D around the world. We built/bought/integrated the software, begged/borrowed/took the data, fused it together into something that can swing a billion data records at our most difficult questions, and trained people in how to wield the tools we’d built.

Disciplines involved:
– large-scale data analytics
– information modeling
– programming
– machine learning
– information visualization
– persuasion
– patience
– impatience
– not knowing when to quit

What would you say?

Converting your American Driver’s License to an Austrian License

So you’re an American expat living in Austria. Whether or not you decide to buy a car here, you’ll need to rent one occasionally, or make use of a car-sharing service like ZipCar or DriveNow, or borrow a friend’s car to bring stuff home from IKEA.

After 30 days, you need an International Driver’s License.

Americans on vacation don’t have to worry about this, because Americans don’t get enough vacation time to stay in Europe for the 30 days before you are *supposed* to have obtained an International Driver’s License. (By the way an IDL is not a driver’s license at all – it’s just a multi-language translation of your existing driver’s license that you are supposed to carry with your regular license.) But those of us who have moved here start to run into the time limits.

Maybe you didn’t bother with the International Driver’s License, and it didn’t matter because rental car companies don’t ask, and let’s say you haven’t been pulled over, so the police have never asked. The next time limit you hit is:

After 6 months, you need to convert your American license to an Austrian license

Now, expats can, and do, get away with not doing this. Again, rental car companies don’t check. They’re used to dealing with American tourists who are never going to be here more than a couple of weeks, or at most three months (that’s the maximum stay for an American without a visa). I have heard that at least one car-sharing service, DriveNow, does check your license every six months and they will suspend your membership if you’re still driving on the American license you signed up with. And, I’ve never been pulled over in Europe so I don’t know for sure, but I’m guessing the police might not bother or even be able to check when you entered Europe when you hand them your American license.

You will need to make sure you keep your license from expiring, which means renewing it online and having the new one sent to an in-state address back in the US that you can receive mail at. Or you can renew it in person when you’re on home leave. Procedures vary state to state.

Regardless, according to the Austrian DMV, you’re not legal to drive on your American license past 6 months. Fortunately, getting your American license converted to an Austrian one is not that hard, and doesn’t require a driving test.

What I did to get my Austrian driver’s license

First, you’ll need to be prepared for the fact that you will need to turn your American license in to the Austrian DMV in order to get your Austrian license. And you won’t get it back. In fact, I’m told they send it back to the US.


Breaking your collarbone in a country with socialized healthcare

As an American professional, I’m accustomed to HMOs, PPOs, group policies, co-pays, etc. I’ve now lived in Austria for over five years, and I’m learning my way around its brand of socialized medicine.

I personally opted for private insurance, rather than the public, but the way it works, I often get the same care, particularly when I go to a hospital.

So, two weeks ago, my son and I head home from his swim lesson, he on his scooter, I on my longboard. And I hit a wet patch of concrete I didn’t see and wipe out. Hard. I’m near blind with pain, but I manage to get up, reassure my son, and get him home. In retrospect, it would have made more sense to get to the hospital immediately, but, well, I didn’t.

Anyway, I get to the hospital and here’s how it goes (It’s a Saturday, by the way.):

12:30: Check in. They ask for my eCard, I explain I have private insurance and will be paying myself. Fine, they’ll send the bill to my house.

12:40: Wait.

13:15: See doctor, who examines me, asks where it hurts, and sends me to the Röntgen Ambulanz (in-house X-ray).

13:25: Wait.

13:30: get X-rayed

13:35: Wait.

13:40: Doctor confirms broken clavicle, prescribes Seractil (dexibuprofen), tells me to come back in a week. Nurse fits me for sling and swathe.

13:45: check out

The Seractil costs 7€ ($8) and is very effective on the pain.

The hospital bill comes to 106.80€ ($117). Total. My insurance company will reimburse me 80% of that, so I’ll only pay 21€.

I go back to the hospital a week later (Friday morning). I’m in and out in an hour. Treatment included another X-ray and feedback from the doctor.

Went back today and it was the same – in and out in an hour. It’s healing well, by the way.

So I’ve been to the hospital three times, seen doctors each time, gotten three x-rays, and checked in as a new patient once. Total time is less than I’ve typically spent in a single hospital visit in the US. The cost, even before I get reimbursed, is a fraction of American medical costs.

I’m telling an outpatient hospital story because the experience is the same whether I have private insurance or not. Private insurance makes more of a difference with in-patient or doctor care, but even then the difference is more about comfort – private rooms and whatnot. 

The War on Terror is over. Terror won.

The country which claims to lead the free world now openly spies on its allies and its own citizens. The rationale? Terrorism and crime.

The President-Elect won on a platform that stated America is in trouble, she is in danger, and your neighbor, your neighboring country, your neighboring religion, your neighboring race, your own allies…they are the danger. What danger? Terrorism, crime, and economic ruin.

Parents risk losing custody of their children if they allow them to explore their own neighborhoods. Why? It’s a dangerous neighborhood. Strangers are dangerous, and they are everywhere. Better your children be taken by CPS than by a human trafficking ring.

We spend billions and have invented almost an entire new branch of government with a slew of new acronyms…DNI, NCTC, DHS, TSA to name a few. Terrorism.

Where once we worried about Africanized bees, now we see the spread of militarized police. Crime.

Cameras, ubiquitous now, are not welcome near the actions of these entities. Drones recording the #NoDAPL protests are grounded by the FAA. We can’t see what the militarized police are doing.

We imprison more of our people per capita than any other developed nation.

But terrorism is a danger on par with texting while driving in number of annual deaths. Perhaps we need to reduce the WoT to a few ad campaigns and PSAs. Or maybe create half a dozen new federal agencies (armed, with drones) to fight the WoTWD.

Crime is down too. Actual stranger danger is so rare as to be almost nonexistent. Your child is in more danger at home, indoors.

The economy is doing well. The rich are getting richer faster than anyone, but everyone else is doing better. Global poverty continues to drop dramatically. Goods are cheaper.

There are localized differences. Deaths by gun are high in the US. Deaths by armed conflict are high in the Middle East. Deaths by disease are high in Africa. Factory workers are losing jobs to automation even as Uber makes anyone with a car employable. But globally, and nationally in the US, we are living better than ever before. 

Our greatest actual dangers are problems of affluence: we have so much cheap food we are eating ourselves to death, and we have so many cars whizzing around that we routinely smash them into each other with people inside.

We are not acting rationally. We are reacting from fear. We live in fear. But we are like a child who is too scared to go down into the dark cellar, but has to be held back from running into traffic. Irrational.

The only winners are those who profit from fear.

Signal to noise

When people talk about information analysis, there’s often a lot of worry about noise in the data, and the reliability of the data sources. So when you’re building an information analysis system, there are often requirements that have to do with “filtering out bad data” or assigning “reliability scores” to data sources.

But in practice this isn’t usually necessary. With enough data, noise suffers from destructive interference, and signal interferes constructively. I first learned about this in physics class in school, and first encountered it in practice while doing radio astronomy. I was a research assistant on a team that realized we had an opportunity to make the highest quality radio maps ever made of certain galaxies. See, when big stars go supernova, astronomers like to aim the Very Large Array of radio telescopes in New Mexico at them to study how the massive explosions grow and change. They get a lot of data, write their papers, and move on. But supernovae happen in galaxies. And scientists share data. So our PIs realized they effectively had very long exposures of the galaxies containing the supernovae. Radio astronomy “pictures” tend to be noisier than pictures taken by optical telescopes, and noise in a picture of a distant galaxy can overwhelm the detail of star formation regions and the like. However, noise is random, whereas the bright spots (except for the supernova) don’t change noticeably. So if you take the sum of enough radio data, the random noise largely cancels out, and the actual bright regions become clear. Note, this is not the same thing as astronomical interferometry, although that was a technique we made use of.

The same thing is true of many other types of noisy data. If you only ever look at the data at a point in time, or watch or listen to it as it changes, it’s difficult to see the signal through the noise, but if you have a system that allows this summation to happen, and you can look at the sum, suddenly the picture becomes clear.
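Here is a small Python sketch of that summation, using made-up numbers rather than real telescope data. It averages the exposures (a sum divided by the count, so the scale stays comparable), and the “ground truth”, the noise level, and the number of exposures are all assumptions chosen for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Made-up "ground truth": a flat background with two bright spots.
truth = [0.0] * 50
truth[12] = 1.0
truth[35] = 1.0


def noisy_exposure(signal, sigma=2.0):
    """One observation: the true signal buried in Gaussian noise."""
    return [s + random.gauss(0.0, sigma) for s in signal]


def stack(exposures):
    """Average many exposures position by position."""
    n = len(exposures)
    return [sum(column) / n for column in zip(*exposures)]


def rms_error(estimate, signal):
    """Root-mean-square difference between an estimate and the truth."""
    return (sum((e - s) ** 2 for e, s in zip(estimate, signal)) / len(signal)) ** 0.5


single = noisy_exposure(truth)
stacked = stack([noisy_exposure(truth) for _ in range(1000)])

error_single = rms_error(single, truth)
error_stacked = rms_error(stacked, truth)

# Positions of the two largest values in the stacked data.
brightest = sorted(range(len(stacked)), key=lambda i: stacked[i], reverse=True)[:2]
```

In any single exposure the noise is twice as large as the bright spots. For independent random noise, averaging N exposures shrinks the noise by roughly the square root of N, so after a thousand exposures the two spots stand well clear of the background.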

Suppose the “ground truth” looks like this:

A draftsman I'm not.

But we have noisy data that looks like this:

I hope that's not an EKG.

If we layer lots of noisy data, we can start to see that the signal’s there…

I drew this better on the whiteboard last week.

But if we can sum the data it looks something like this:

"The hands acquire shakes, the shakes become a warning. It is by caffeine alone I set my mind in motion." --

Now we can clearly see the signal! Is it truth? Not necessarily, but the analyst can now see that there is agreement across the data. If you want more information, you also need a system that gives you a way to dive into the details of what data contributed to the peaks. And now you also have guidance as to where to collect more data, ideally from additional sources. More on that in another post.

This is related to why Google is so good and your organization’s internal Search is so bad. Even though Google’s data source, the Internet, is way noisier than your organization’s intranet (I hope), Google is still better. This is true even if you have an in-house Google appliance. It’s because of Google’s second big data source (equally noisy): the billions of user clicks. Google doesn’t show you the sum of that data, but it does use that aggregation to decide what to show you. In essence, Google finds the peaks of agreement among billions of clicks and shows you just the peaks. Your IT department doesn’t have enough click-data to do that for your organization, no matter how good their software is.
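As a toy version of “finding the peaks of agreement,” here is a Python sketch that ranks results purely by how often people clicked them. The click log and page names are invented, and this is certainly not how Google actually ranks anything; it only shows how stray clicks wash out once they are counted together.

```python
from collections import Counter

# Hypothetical click log: (query, result clicked). Individual clicks are noisy.
clicks = [
    ("vacation policy", "hr/leave.html"),
    ("vacation policy", "hr/leave.html"),
    ("vacation policy", "cafeteria/menu.html"),      # a stray click
    ("vacation policy", "hr/leave.html"),
    ("vacation policy", "hr/old-leave-2009.html"),   # an outdated page
    ("vacation policy", "hr/leave.html"),
]


def rank_by_clicks(log, query, top=3):
    """Order the results for a query by how often users clicked them."""
    counts = Counter(result for q, result in log if q == query)
    return [result for result, _ in counts.most_common(top)]


ranking = rank_by_clicks(clicks, "vacation policy")
```

With six clicks the ordering is fragile. With billions, the peaks are hard to miss, which is the advantage your IT department doesn’t have.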

Reliability Scores

You also probably don’t need to assign reliability scores to your data sources, even though it seems like a perfectly logical, even prudent, thing to do. The problem is that the scores will be fairly arbitrary, hard to agree on, and may present a false sense of rigor where there isn’t any. There are lots of ways a data source can be unreliable, but we’ve found different ways to handle them that avoid these problems. For example:

Problem: There’s hardly any signal (useful information) in the data.
Solution: You’re not going to get any peaks even when you sum it all together. Don’t score it, ditch it.

Problem: There are a lot of mistakes or inconsistencies in the data.
Solution: That’s the kind of noise that cancels out if you have enough data. If you do, then don’t sweat it.

Problem: The data has been deliberately redacted to remove what you’re looking for.
Solution: The more data there is, the harder it is to do this perfectly. If you get enough of it, you can find what was missed. Also, if you have enough of it, you’ll see mysterious quiet areas of data, because not only is the signal gone, but so is the noise. So you can detect the obfuscation, and you might even be able to catch the deceivers in a mistake.

Problem: The data is out-of-date.
Solution: This is absolutely relevant to the analysts, and it should be documented, but it’s not something you score. The analysts just need to know because data timeliness matters more for some questions than others.

Problem: There are gaps in the data coverage.
Solution: Again, it’s relevant, and should be documented, but it’s not a “reliability” issue. Maybe there weren’t enough 18-25 year-olds in the medical study you’re analyzing. Even so, if there’s a statistically significant result visible for 25-35 year-olds, you’ve still found something; you just don’t know if it works for young people.

Problem: The data’s useful, but its noise is obscuring the signal of other, cleaner data sets.
Solution: Let the users turn data sets on and off as they choose. See, this is actually what people have in mind with reliability scores – they imagine they’ll either weight more reliable data higher, or the user will use the score to decide what to look at. It’s true, you might end up doing some form of weighting. For example, a search engine might weight clicks higher for users who appear to come from a part of the world that speaks the language the results are in, so clicks from Italy have more effect on ranking Italian search results. But you don’t want to do this before you’ve had a chance to work with the data in the system. As for showing the reliability scores to the analysts…believe me, your analysts already have strong opinions on the reliability of individual data sources, and they will ignore your well-intentioned ratings. If you just give them the ability to turn them on and off, they’ll be happy and productive.
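A minimal Python sketch of the on/off idea, with optional weights left for later: the function `combine` and the three data sets are hypothetical, and a real system would work with far richer data than short lists of numbers.

```python
def combine(datasets, enabled, weights=None):
    """Sum the enabled data sets position by position, with optional weights."""
    weights = weights or {}
    active = [name for name in datasets if name in enabled]
    if not active:
        return []
    total = [0.0] * len(datasets[active[0]])
    for name in active:
        w = weights.get(name, 1.0)     # unweighted unless someone decides otherwise
        for i, value in enumerate(datasets[name]):
            total[i] += w * value
    return total


# Hypothetical sources: two fairly clean ones and one noisy one.
datasets = {
    "registry": [0, 0, 3, 0, 0],
    "survey":   [0, 1, 2, 0, 0],
    "scraped":  [4, 0, 1, 5, 3],
}

everything = combine(datasets, enabled={"registry", "survey", "scraped"})
without_scraped = combine(datasets, enabled={"registry", "survey"})
peak_position = without_scraped.index(max(without_scraped))
```

With all three sources on, the peak at position 2 is barely higher than the noise around it. With the noisy source switched off, it is the only thing left standing, and nobody had to agree on a score to get there.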

In short, signal reveals itself in noisy data if you have enough of it, and if you have tools that let you work with all of it in aggregate while still letting you quickly get the details of the revealed signal.