Rug? What rug?
So, I know from personal experience that JavaScript dates are hairy. While writing my new blog site (this site), I wanted a function to convert a yyyy-mm-dd date string into a plain English format.
Here's my AI prompt:
show me code to translate a "yyyy-mm-dd" string such as
"2024-12-08" to a human readable string such as
"December 12, 2024", in javascript
Not the cleanest prompt, I know, but the question here is not vague. ChatGPT happily spat out a function for it.
function formatDate(inputDate) {
  // Parse the input date string
  const date = new Date(inputDate);

  // Check for invalid date
  if (isNaN(date.getTime())) {
    throw new Error("Invalid date format. Use 'yyyy-mm-dd'.");
  }

  // Create an options object for formatting
  const options = { year: 'numeric', month: 'long', day: 'numeric' };

  // Use Intl.DateTimeFormat to format the date
  return new Intl.DateTimeFormat('en-US', options).format(date);
}
Notice what's wrong? Claude made the same mistake, forgetting to set the time zone.
function formatDate(dateString) {
  const date = new Date(dateString);
  return new Intl.DateTimeFormat('en-US', {
    year: 'numeric',
    month: 'long',
    day: 'numeric'
  }).format(date);
}
To fix the code, you just need to add this to the options:
timeZone: 'UTC'
Telling the LLM to fix it works well enough too. Without the fix, the function returns an incorrect date (one day behind) whenever the local time zone is west of UTC: JavaScript parses a date-only string like '2024-12-08' as midnight UTC, but Intl.DateTimeFormat formats it in the local time zone by default. Basically this would happen:
>> console.log(formatDate('2024-12-08'));
<< December 7, 2024
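Putting it together, here's a minimal corrected sketch (the same function ChatGPT produced, with the timeZone option added):

function formatDate(inputDate) {
  const date = new Date(inputDate);

  // Check for invalid date
  if (isNaN(date.getTime())) {
    throw new Error("Invalid date format. Use 'yyyy-mm-dd'.");
  }

  // Format in UTC so the output matches the input string,
  // no matter what the machine's local time zone is.
  return new Intl.DateTimeFormat('en-US', {
    year: 'numeric',
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC'
  }).format(date);
}

console.log(formatDate('2024-12-08')); // December 8, 2024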
Okay, so the rug. The poor rug. Every time little problems like these pop up, AI vendors do their best to sweep them under the rug, fine-tuning their models to hide them.
AI vendors are sweeping multitudes of common problems under the rug to make their models appear better in benchmarks and in the public eye. Is this helping the models? I'd say no. Tuning for specific subjects inadvertently affects other subjects, much like adding noise to a clean signal.
It looks good on the surface, OpenAI scoring new records on ARC-AGI-Pub, for example, but the scope is so narrow and focused that it feels like we're ignoring the broader goal. I think the best results come when language models are in a more "natural" state with broad, non-specific tuning.
I don't have a proposal for how to achieve the next order of reasoning, but the current direction is a bit concerning.
Hopefully we'll see more progress in the near future, but I wish vendors would be a little more honest about the state of AI.