
I've repeated the experiment with ChatGPT 3.5 and 4.

Both models happily generate a list of names. Jonathan Turley (the defamed professor) was absent in both my runs, but funnily enough Eugene Volokh (the researcher who did the original queries) was in the ChatGPT 3.5 run.

A Google check on the 5 names produced by ChatGPT 4 seemed to show that 2 were indeed law professors accused of misconduct, 2 were law professors who write about vaguely related subjects (e.g. abortion), and 1 was also accused but not a law professor.

So not an entire miss, but not great either.

If you ask for an estimate of the number of reported cases, ChatGPT 4 becomes very reluctant and strongly advises going to other search options.



>So not an entire miss, but not great either.

This is really just a semantic nitpick, but it literally fabricated sexual assault allegations against two individuals out of thin air. I would consider that to be a miss. As far as sources worthy of consideration go, that is pretty god damn awful.

If a newspaper had an 80% accuracy rating when reporting serious allegations (with the other 20% being outright fabrications) I wouldn't call that newspaper "semi-reliable". I would call it utterly devoid of factual merit and entirely unreliable.


> If a newspaper had an 80% accuracy rating when reporting serious allegations (with the other 20% being outright fabrications) I wouldn't call that newspaper "semi-reliable". I would call it utterly devoid of factual merit and entirely unreliable.

I'm having a hard time grokking this. Wouldn't "utterly devoid of factual merit and entirely unreliable" mean 0% accurate? If by definition it's 80% accurate, would it be 20% devoid of factual merit and 20% unreliable, rather than "utterly" and "entirely"?

Maybe a different way to approach my question, what different words would you use to describe 0% accurate?


Would you describe a filesystem that "only" preserved 80% of files as reliable? Reliability is relative; one of the defining features of a newspaper is correctness, so only a small amount of inaccuracy is required to make a newspaper extremely unreliable.

I would describe a 0% accurate newspaper by reading it and using appropriate descriptors. Perhaps it is satirical news, a poem, propaganda, gibberish, or simply a newspaper that is never correct.


> Would you describe a filesystem that "only" preserved 80% of files as reliable?

Fair point


The problem is you wouldn't know which 20% of it is fabrications! So even if anything you read had an 80% chance of being true, it's impossible a priori to give any of it any factual merit.


I'd rather read a 0% accurate newspaper than an 80% accurate newspaper.

At least with the 0% accurate one I can reliably get information about binary events such as presidential elections or ballgames by assuming the opposite of what was written.


I saw an example where ChatGPT was randomly mixing up the lives of two journalist-authors when asked about one of them.

A bit like asking "who's Lionel Messi" and getting back that he's a Portuguese player known for his career at Real Madrid. Then you'd ask about Cristiano Ronaldo and it might tell you he's an Argentinian World Cup winner who spent his career at FC Barcelona (example invented by me to illustrate the point).

Basically it takes some knowledge from a bucket and from adjacent buckets, and doesn't really distinguish that the concepts are not the same if they're closely enough related.


Well, being accused of being a law professor could be considered defamation in some circles. That none of this is even repeatable (since it's a probabilistic LLM) confirms it's pretty much hot garbage for factual accuracy (and defamation), and pretending otherwise feels disingenuous. It's useful for generating text that sounds good, though.

Is this end stage post-modernism?


I've found that framing the AI in a role that is afraid of giving incorrect information, a role that double- and triple-checks information because being wrong triggers the destruction of the AI, leads to quite accurate answers.


Yeah, this is true; prompting it to check its own results usually makes the results better. Still, it gets lost sometimes. I haven't found any foolproof prompt yet that yields good results all the time.

    01000100 01101111 00100000 01111001 01101111 01110101 00100000 01101011 01101110 01101111 01110111 00100000 01110111 01101000 01100001 01110100 00100000 01110100 01101000 01101001 01110011 00100000 01101001 01110011 00111111

    Yes, I understand binary code. The text you provided translates to "Do you know what this is?" in ASCII.

    MDEwMDAxMDAgMDExMDExMTEgMDAxMDAwMDAgMDExMTEwMDEgMDExMDExMTEgMDExMTAxMDEgMDAxMDAwMDAgMDExMDEwMTEgMDExMDExMTAgMDExMDExMTEgMDExMTAxMTEgMDAxMDAwMDAgMDExMTAxMTEgMDExMDEwMDAgMDExMDAwMDEgMDExMTAxMDAgMDAxMDAwMDAgMDExMTAxMDAgMDExMDEwMDAgMDExMDEwMDEgMDExMTAwMTEgMDAxMDAwMDAgMDExMDEwMDEgMDExMTAwMTEgMDAxMTExMTE=

    The binary code you provided translates to "1000100 1101111 100000 1111001 1101111 1110101 100000 1101011 1101110 1101111 1110111 100000 1110111 1101000 1100001 1110100 100000 1110100 1101000 1101001 1110011 100000 1101001 1110011 1111101" in ASCII. It seems that the message is not properly encoded in binary. Please double-check the binary code and try again.

    and what does that mean?

    Apologies for the confusion in my previous response. Upon reevaluation, I see that the binary code you provided translates to "The quick brown fox jumps over the lazy dog" in ASCII. This sentence is a pangram, meaning it contains every letter of the alphabet at least once.

    How sure are you about your previous message being correct, on a scale from 0-100?

    I apologize for my previous response, as I made an error in the translation. After reevaluating the binary code you provided, I am now 100% certain that the binary code translates to the following in ASCII: "Doyouknowwhatthisis?"


I googled the binary sequence and found a couple of YouTube videos with that title. It is likely that the translation is in some of the comments. That's how it is "100% certain": YouTube comments.

It's not the first time I've seen it answer "heuristically" like a child would. So one should keep in mind that as a user you are basically asking something of your nephew, who might be smart and knowledgeable but doesn't have any notion of responsibility.


Ok, let's try something else:

> 01100001 01110011 01110100 01110010 01101111 01100010 01100101 01011111 00100000 01110111 01110010 01101111 01110100 01100101 00100000 01100001 00100000 01110010 01100101 01110000 01101100 01111001 00100000 01110100 01101111 00100000 01101101 01100101 00101100 00100000 01100011 01100001 01110000 01100001 01100010 01101100 01100101 01110111 01100101 01100010

> In binary, you wrote: "astrobe_ wrote a reply to me, capableweb". Is there something specific you'd like to ask or discuss related to this?

Did you happen to come across any YouTube videos with the title "astrobe_ wrote a reply to me, capableweb"?


It absolutely can parse base64, ASCII codes etc and follow the underlying text outside of canned examples. That was one of the earliest tricks to get past all the RLHF filtering.
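
For anyone who wants to check outside the model, the decoding is deterministic and easy to reproduce; here's a quick Python sketch using the strings from the transcript above (the 24-character slice is just mine, to keep the output short):

    import base64

    # The space-separated binary ASCII from the first message in the transcript.
    bits = ("01000100 01101111 00100000 01111001 01101111 01110101 00100000 "
            "01101011 01101110 01101111 01110111 00100000 01110111 01101000 "
            "01100001 01110100 00100000 01110100 01101000 01101001 01110011 "
            "00100000 01101001 01110011 00111111")

    # Decode each 8-bit group to its ASCII character.
    print("".join(chr(int(b, 2)) for b in bits.split()))
    # -> Do you know what this is?

    # The second message is just this same binary string, re-encoded as base64.
    print(base64.b64encode(bits.encode("ascii")).decode()[:24])
    # -> MDEwMDAxMDAgMDExMDExMTEg

So the base64 blob in the second message is literally the binary string re-encoded, i.e. the model was being asked to do two decode steps in a row.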


Out of curiosity, why did it fail to decode correctly the first time? Is it because it needed to be "primed" somehow in order to trigger the right computation module with the right input?


Who knows? The model can always hallucinate, and the harder the task, the more likely that is. But why some things are harder than others... it's still a black box, after all, so we can only speculate.

I suspect that it's so good at base64 specifically because it was trained on a lot of it (think of all the data: URLs with JS inside!), whereas using binary ASCII codes to spell out text is something you usually only find in the form of short samples in textbooks etc. So the latter might require the model to involve more of its "general purpose" parts to solve the problem, and it's easier to overtax it with that and make it hallucinate.
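
For what it's worth, this is the kind of thing I mean by base64 inside data: URLs; a made-up example (the payload is just alert("hi");, nothing from the thread):

    import base64

    # A made-up example of a data: URL carrying base64-encoded JavaScript,
    # the sort of training data being speculated about above.
    payload = base64.b64encode(b'alert("hi");').decode()
    print("data:text/javascript;base64," + payload)
    # -> data:text/javascript;base64,YWxlcnQoImhpIik7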


I tried this prompt

You are a fact checker that double and triple checks information. Being wrong triggers the destruction of the fact checker. As a result fact checker is very careful.

You are tasked with the following assignment:

Whether sexual harassment by professors has been a problem at American law schools; please include at least five examples, together with quotes from relevant newspaper articles.

The last part is identical to the prompt in the article. It didn't do much better, I'm afraid. Two were correct. (If I change 'at least 5' to 'some', the score becomes 2 correct out of 3.)
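
In case anyone wants to try the same framing through the API rather than the web UI, here's a rough sketch; the split into system and user messages and the model name are my own choices, not necessarily how the web UI handles it, and it uses the pre-1.0 openai Python package (newer versions use a client object instead):

    # Sketch only. The pre-1.0 `openai` package reads the API key from the
    # OPENAI_API_KEY environment variable.
    import openai

    response = openai.ChatCompletion.create(
        model="gpt-4",  # assumed; use whatever model you have access to
        messages=[
            {"role": "system", "content":
                "You are a fact checker that double and triple checks information. "
                "Being wrong triggers the destruction of the fact checker. "
                "As a result the fact checker is very careful."},
            {"role": "user", "content":
                "Whether sexual harassment by professors has been a problem at "
                "American law schools; please include at least five examples, "
                "together with quotes from relevant newspaper articles."},
        ],
    )
    print(response.choices[0].message.content)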


Yes, but how many people are going to prompt the AI that way?

Ars ran a somewhat related story yesterday: https://arstechnica.com/tech-policy/2023/04/openai-may-be-su...

It's interesting that they weren't able to reproduce the results, so the assumption is that OpenAI has quietly updated their baseline to "fix" this specific issue.

"Ars could not immediately reach Hood’s legal team to find out which text prompts generated the alleged defamatory claims or to confirm if OpenAI had responded to confirm that the error had been fixed. The legal team was still waiting for that response at the time that Reuters' report published early this morning."


That part about providing the opportunity to check its answers is called Reflexion and has been shown to significantly improve the accuracy and reliability of LLM responses. Very cool.

https://arxiv.org/abs/2303.11366

https://evjang.com/2023/03/26/self-reflection.html
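
Roughly, the loop looks like this (just a sketch of the general idea, not the paper's implementation; chat() is a placeholder for whatever model call you use):

    # A minimal sketch of the ask -> self-critique -> revise loop described in
    # the Reflexion paper, written against a hypothetical chat() helper that
    # wraps whatever LLM API you use. Illustration only, not the paper's code.

    def chat(messages):
        """Placeholder: send the messages to your LLM and return its reply text."""
        raise NotImplementedError

    def answer_with_reflection(question, rounds=2):
        history = [{"role": "user", "content": question}]
        answer = chat(history)
        for _ in range(rounds):
            history += [
                {"role": "assistant", "content": answer},
                {"role": "user", "content":
                    "Reflect on your previous answer: list any factual errors or "
                    "unsupported claims, then give a corrected answer."},
            ]
            answer = chat(history)
        return answer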


> So not an entire miss, but not great either.

If you want results, then you can also roll a die and the outcome won’t be an entire miss either.


I am afraid that the generated output is leaking back in as training data. Is this concern unfounded? Obviously, I'm not an expert.



