There is uncertainty and confusion about where and when to use AI-generated content. Jonathan says, “Society is wrestling with this topic of where AI content is allowed and good and where it should never go. It's not necessarily like, hey, this is good, this is bad, but it's a spectrum that I think society still needs to identify and deal with. But I think it's a topic that is going to play out, potentially in the courts, around questions like: where is it legal?”
AI-generated content has skyrocketed in the last few years, but consumers are still looking for that human element when it comes to content. Jonathan says, “On that question of where are we okay with it and where are we not okay with it, I think society generally agrees that reading an online review that was generated by AI is not what people want. People want to know that what they were seeing was the true human experience of that product. What we've seen is that, as you could expect, with the launch of ChatGPT, the amount of AI that existed on some of these review platforms absolutely exploded.”
AI-generated content can be valuable, but brands need to understand how it will impact their business if used improperly. Jonathan says, “I'd say don't bury your head in the sand as it relates to AI and its impact on your business. The unthoughtful use of AI within your business and within its marketing can lead to significant consequences. That's worth thinking about. If you're okay with those risks, you can hammer AI. If you're concerned about what you have built and want to make sure that it's protected, put in place sensible controls so that AI doesn't run wild in your business, causing irreparable reputational harm.”
Listen to this week’s episode to learn more about how businesses can use AI-generated content to build confidence in their brands.
About the Guest:
Jonathan Gillham is the Founder and CEO of Originality.AI, a trailblazing venture in the realm of artificial intelligence and content originality.
Jon founded Originality.AI after successfully building and later selling two content marketing agencies. His entrepreneurial spirit didn't stop there. Jonathan recognized the immense potential of artificial intelligence in transforming content creation and verification, so in November 2022, Originality.AI was born, driven by the vision of detecting GPT-3-generated content (before the launch of ChatGPT).
Having been one of the earliest adopters of generative AI content for SEO at scale through his agency, he understood the wave that was coming, which ChatGPT and GPT-4 have since fully unleashed. He also recognized the need for a modern plagiarism-checking solution that delivered advanced features such as scan history, detection scores, shareable results, and team access, all of which are now integral parts of Originality.AI.
Jon's work has garnered attention from renowned publications like The New York Times, The Guardian, and Axios. His expertise in AI content detection is shaping the narrative of the digital era.
Erik Martinez: [00:00:00] Welcome to today's episode of the Digital Velocity Podcast. Tim and I are excited to introduce Jonathan Gillham of Originality.AI, a platform that detects AI-generated content and plagiarism to ensure authenticity and integrity. With a strong background in technology and entrepreneurship, Jonathan is dedicated to restoring trust in online content through innovative AI solutions. Jonathan, welcome to the show.
Jonathan Gillham: Yeah. Thanks, Erik. Thanks, Tim. Thanks for having me.
Erik Martinez: Before we get into the heart of the topic today, do you mind giving our listening audience a [00:01:00] brief synopsis of your journey to this point in time?
Jonathan Gillham: Yeah, sure. So, I took mechanical engineering at school, worked in oil and gas, always wanted to get back to my hometown, and built up some online businesses. Those businesses generally focused on publishing content on the web, getting traffic from Google, and then monetizing that through either an E-commerce site or a content site. That led to building a content marketing agency, where we worked with a bunch of writers and clients that were buying content from the agency. I sold that, but through that process I had seen the wave of generative AI that was coming, predating ChatGPT, from tools like Jasper AI, and that was what led to identifying the need for a content authenticity tool like Originality.
Erik Martinez: That's interesting. You know, I was reading an article yesterday. It was actually an opinion piece, I believe in USA Today, where the author was talking about free speech. I'm not going to get into the politics of free speech, but one of [00:02:00] the interesting things they were talking about is all the different kinds of speech and whether it's protected or not, and whether lying, you know, nonfactual speech, is protected speech.
In the context of what we're talking about, the concept of misinformation, inaccurate information, and the veracity of information is really, really important, because we all rely so heavily on our devices to get information. One of the things we have seen from an SEO standpoint in the past, I'll just give you an example. We were seeing Google downgrade listings that used stock images. The more often Google found that stock image elsewhere, the more it downgraded that particular listing, because it was stock imagery and not original imagery.
Today, with this abundance of AI tools, the prevalence of Midjourney and other tools like [00:03:00] it, where you can kind of create everything and it is AI-generated, how does that play out? I think people are struggling. Brands are struggling with this. Consumers are struggling with the idea that the information they see online is not always factual or trustworthy.
And I remember way back in college, I had a professor who gave us this assignment. I actually happened to get the right answer, but I'm not sure I got it for the right reason, because she sent us out on this fact-finding mission, a little research project. We came back, half the class got it right, half the class got it wrong. It was all about the source of the information that we found. I think we're having this conversation today, and it's really, really important, because content is driving what we're doing. And with AI becoming more prevalent, how do we properly verify and trust the information that we're seeing?
Jonathan Gillham: It's an interesting and challenging question right now. So, [00:04:00] society is wrestling with this topic of where AI content is allowed and good and where it should never go. You know, I think if we heard a political speech and then found out it was purely AI-generated, I don't think people would be very happy. If you're reading online reviews about a product, especially a product that deals with, you know, the health and safety of a loved one, not very happy. If it's summarizing the weather? Yeah, you know, that's probably okay.
It's not necessarily like, hey, this is good, this is bad, but it's a spectrum that I think society still needs to identify and deal with. We try to be that solution for people when they want to understand it. But I think it's a topic that is going to play out, potentially in the courts, around questions like: where is it legal? Where does the creator of a piece of content have ownership of that content, especially if it was based on something that was fed into the LLM? Like, if a copyrighted piece gets rewritten, who owns it? So, I think there's the legal aspect of how this is going to play out. And then, I think there's also a [00:05:00] societal acceptance of where it is acceptable and not acceptable to use AI.
Tim Curtis: It's going to be very similar to the debate in the U.S. in the early 2000s over online music, the ability to download it, and who controlled that. The way that it was set up, of course, they had never dreamed of streaming rights. That was not clear. Eventually there had to be some compromise and agreement outside the courts, but ultimately the courts had to set the precedent for case law.
I think you're spot on. Everything we know about the process thus far is that everybody's collectively waiting for a court to adjudicate the matter. The interests are pretty well entrenched on their sides. Anyway, it'll be interesting to see how that plays out.
I understand that your team has done some studies on some of these websites, like Glassdoor and Amazon, looking at how much of that [00:06:00] UGC is either AI-altered or AI-original. What are the implications of something like that for a review site? How does that play out, and what did you guys come up with as an answer?
Jonathan Gillham: On that question of where are we okay with it and where are we not okay with it, I think society generally agrees that reading an online review that was generated by AI is not what people want. People want to know that what they were seeing was the true human experience of that product.
What we've seen is that, as you could expect, with the launch of ChatGPT, the amount of AI that existed on some of these review platforms absolutely exploded. We'll use G2 as an example. Around 5 percent of reviews that predated GPT-3 were flagged as AI-generated. Some of that would be false positives, and some would be AI reviews from GPT-2 and similar tools that predated GPT-3. Then it increased to around [00:07:00] 15 percent before ChatGPT launched. And after ChatGPT launched, over a third of reviews on G2 in particular were AI-generated.
Some of those might be, you know, the way that I use it sometimes, where it's my own thoughts and I just get the tool to edit them. But some of those were malicious reviews with the intent of manipulating the rankings. When we do these studies, we also look at the extremes. If it was just people using it for the reason I mentioned, editing their review and making it grammatically correct, that would be one thing. What we saw was that a disproportionate number of low reviews and high reviews were AI-generated, showing that it was definitely being used for the purposes of review manipulation.
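To make the shape of that study concrete, here is a minimal sketch of the kind of analysis Jonathan is describing: bucket reviews by detector score, launch era, and star rating, then check whether flagged reviews cluster at the rating extremes. The record format, the 0.5 score threshold, and the toy data are illustrative assumptions, not Originality.AI's actual methodology.

```python
from datetime import date
from collections import defaultdict

# Toy review records: each has a date, a star rating, and an AI-detector score.
reviews = [
    {"date": date(2019, 6, 1), "stars": 4, "ai_score": 0.10},
    {"date": date(2021, 8, 1), "stars": 1, "ai_score": 0.85},
    {"date": date(2023, 3, 1), "stars": 5, "ai_score": 0.92},
    {"date": date(2023, 5, 1), "stars": 3, "ai_score": 0.20},
]

GPT3_LAUNCH = date(2020, 6, 11)      # GPT-3 API launch
CHATGPT_LAUNCH = date(2022, 11, 30)  # ChatGPT launch

def era(d):
    if d < GPT3_LAUNCH:
        return "pre-GPT-3"
    return "pre-ChatGPT" if d < CHATGPT_LAUNCH else "post-ChatGPT"

# Share of reviews flagged as AI (score >= 0.5, an assumed cutoff) per era.
flags_by_era = defaultdict(list)
for r in reviews:
    flags_by_era[era(r["date"])].append(r["ai_score"] >= 0.5)

for name, flags in flags_by_era.items():
    print(f"{name}: {100 * sum(flags) / len(flags):.0f}% flagged as AI")

# The manipulation signal: do flagged reviews cluster at 1-star and 5-star extremes?
extreme = [r for r in reviews if r["stars"] in (1, 5)]
middle = [r for r in reviews if r["stars"] in (2, 3, 4)]
rate = lambda rs: sum(r["ai_score"] >= 0.5 for r in rs) / len(rs)
print(f"extreme ratings flagged: {rate(extreme):.0%}, middle ratings flagged: {rate(middle):.0%}")
```

The last comparison is the tell Jonathan mentions: if 1-star and 5-star reviews are flagged far more often than middle ratings, the AI use looks like ranking manipulation rather than grammar cleanup.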
Tim Curtis: For me, when we talk about trust, brands have to be careful, because customers online want to make sure they can trust that the product is good, that the product is living up to the information on the brand's [00:08:00] website. So, for me, the trust issue boils down not so much to whether AI was used, but to the intent. AI is an editor, right? That's a fairly easy concept for folks to understand. I think everybody understands the role of an editor: to help shape things, make them concise, and make them grammatically correct.
But to your point about malicious or misleading use, or attempting to put your finger on the scales, that's really where I think this is taking shape. We're seeing the trust issue come down to that manipulation, I'm going to call it manipulation, of reviews or votes, for example, to push the product higher in the algorithm. That's the kind of thing that I think is for sure a trust issue. And on a platform like Amazon, which already has trust issues because of the number of counterfeit products, this is something that brands in particular are going to have to really figure out.
Jonathan Gillham: For sure. Review manipulation is not new. But like [00:09:00] anything with generative AI, the scale at which it can be done has drastically increased, and we've lost the ability to read reviews for tells, whether it was the language that was used, the genericness of the review, or all the reviews being in broken English. There were other tells too.
Now those reviews can be incredibly detailed, sound like personal experience, and be long and well written. That has really undermined the sixth sense we developed over, you know, decades on the internet for telling what was real and what wasn't. That's all been thrown up in the air.
Erik Martinez: What do brands do? We're increasingly in this world where content is more and more important, and that content can be generated faster and faster using generative AI tools. How do brands balance the need to generate content to help drive their business, tell the story of their products, and communicate benefits and value [00:10:00] to their customers? How do they balance that with the need to also police, to a certain extent, the content that they're producing as well as the user-generated content being created on behalf of their brand? How do they go about building processes to combat this particular issue?
Jonathan Gillham: I'm very much pro-AI. It's a phenomenal tool that can be used in so many incredible ways in the world of marketing. Where brands go awry is when they don't think through this problem and don't have a policy in place for how to deal with it. I think there are great use cases. You know, if an image gets generated and it's playful and useful and communicates a benefit and isn't misleading, that's a great use case. We have a pool business and had ChatGPT generate a list of 365 pool jokes, so that every day an invoice goes out, it includes one of those jokes. Great use case. We had to make sure they were all PG, [00:11:00] but it was a great use case.
So, what do companies need to do? I think it's to think through where they do and don't want AI to be used and how much of a human in the loop they need. There are brands that are totally screwing this up: they don't have a human in the loop, they think they've driven their content costs down to zero, and someone in marketing is let loose on the content machine without controls. That causes reputation damage when there's a hallucination and inaccurate, potentially dangerous information gets shared.
There are also consequences in the eyes of Google when brands publish too much AI-generated spam. Step one is to think through and put in controls on where AI is and isn't allowed to be used, and then put in the right mitigating controls to be able to measure when it has been used.
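As a rough illustration of what those controls could look like in code, here is a hypothetical publishing gate: a per-content-type policy that records where AI is allowed and whether a human reviewer is required. The content types, policy values, and 0.5 detector threshold are invented for this sketch and are not from the episode.

```python
# Hypothetical per-content-type AI policy: where AI is allowed and
# whether a human reviewer must sign off before publishing.
AI_POLICY = {
    "customer_review":     {"ai_allowed": False, "human_in_loop": True},
    "product_description": {"ai_allowed": True,  "human_in_loop": True},
    "invoice_joke":        {"ai_allowed": True,  "human_in_loop": False},
}

def may_publish(content_type: str, ai_score: float, human_reviewed: bool) -> bool:
    """Gate publishing on the brand's AI policy plus a detector score in [0, 1]."""
    policy = AI_POLICY[content_type]
    if not policy["ai_allowed"] and ai_score >= 0.5:  # 0.5 is an assumed cutoff
        return False  # AI-flagged content where the policy forbids AI
    if policy["human_in_loop"] and not human_reviewed:
        return False  # policy requires a human reviewer and none signed off
    return True

# A flagged customer review is blocked; a human-reviewed product description passes.
print(may_publish("customer_review", ai_score=0.9, human_reviewed=True))      # False
print(may_publish("product_description", ai_score=0.9, human_reviewed=True))  # True
```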
Erik Martinez: Taking that a step further, you were talking about how there's potentially a Google, I won't use the word penalty in the classic SEO sense, but a Google penalty for releasing a bunch of spammy [00:12:00] content. How are the search engines really looking at this content?
SEO is still a very important part of what we do as marketers in the E-commerce space. So, how do we leverage these tools to help us do that job better, faster, and smarter, yet still be authentic? You were talking about policy. There's policy, but there's also the creation process, right? What is it about that creation process that brands can do or leverage to ensure they're striking the right balance?
Jonathan Gillham: Yeah. So, I've been in the content and SEO space since early 2008. I think there's been a pretty consistent shift in focus. When I first started, it was maybe 90 percent focus on the algorithm and 10 percent on the audience. That was probably a reflection on me at the time as well.
But now I think it's certainly flipped the other direction, where most [00:13:00] good content strategy is 90 percent focused on the audience and 5 or 10 percent focused on making sure you're doing what Google wants: you're not duplicating content, and you're answering the searcher's intent with an answer right at the top of the article. I think there are a lot of things we have moved to, and will continue to move to, around focusing on the audience and solving the searcher's intent. I know that's really vanilla. So where's the edge, and where does AI help produce the edge?
I think what it can do is help in the content creation process by, A, helping you understand the structure of your site, which I think is a great use case, and, B, identifying gaps you haven't talked about. In the actual creation, though, it's really hard for AI to be the main horsepower. If you're going to use content as a differentiator for your brand, it's really hard for that content to be generated by AI, because it doesn't have your unique voice cooked into it.
Using it for editing, QA/QC, as a fact-checking aid, and for [00:14:00] grammar and spell checking: great use cases. Image generation: great use case. But actually thinking through the creation with your specific voice is hard to do authentically with the click of a button through AI.
Tim Curtis: So, I have a question. We were talking before the show, and again, this kind of goes back to the usage of AI. On Wednesday I had lunch with some friends. Two of them are also writers. The topic naturally changed, and we started to discuss AI.
Because with anybody these days, that seems to be the main topic; at least it's taken all the oxygen in the room. As we were chatting before the show, I mentioned that one of the things we were noticing is that the better the writer, the higher the percentage of your content that seemed to be getting graded as AI. You mentioned there may be some truth to that, but I'm just curious, in the scheme of content, what's your position on that? What do you think is happening there?
Jonathan Gillham: Yeah. So, AI detectors like [00:15:00] ours are what's called a classifier. The way they're built is that they're trained on millions of records of human content and millions of records of AI content, and they start to learn the difference between the two. The classifier identifies a whole bunch of similarities, fingerprints, and patterns, unseen to the human eye but visible to the AI, that exist within AI-generated content and not in human content, and vice versa.
Then, when it's given another piece of content, it asks: do I think this is AI? Do I think this is human? And it assigns a confidence score to that. Detectors, and I can speak to ours and to the others we've tested, are not perfect. Given 100 articles that are AI-generated, especially if it's just a vanilla "create this piece of content" prompt put into ChatGPT, they're 99 percent plus accurate at identifying AI as AI.
Similarly, on human content, they'll generally get 97 percent right. If you give a detector a piece of human content, it will get it right 97 percent of the time, meaning 3 percent of the time it [00:16:00] doesn't. That's called a false positive, where human content has gone in and been identified as AI. There are a few reasons why that can happen.
If the type of content it's looking at is not well represented in the training data, say legal text, false positives go up and accuracy drops. If it's really strangely formatted text that isn't represented in the detector's training data, it gets it wrong.
Similarly, one of the easiest ways to make sure content gets identified as human is to introduce a bunch of errors, because AI doesn't produce errors. So, the false positive rate on poorly written human content is really, really low, because the tool knows that AI would never write like that.
So, it's not necessarily that it's incorrectly identifying well-written content as AI. I mean, the end result is that it is, but it's more that with poorly written [00:17:00] content, it knows it's human. So, there is some truth to that: great writers will see a higher rate of false positives, still in that 3 percent range, than really poor writers, where it's like, yeah, there's no way AI is going to produce that pile of crap.
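For readers who want to see the classifier idea in code, here is a minimal sketch under stated assumptions: a TF-IDF plus logistic-regression pipeline trained on labeled human and AI text, which scores new text with a confidence value. The toy training data and the model choice are illustrative; this is not Originality.AI's actual architecture or training set.

```python
# Minimal classifier sketch: learn surface patterns that separate human from AI
# text, then score new text with a confidence value, as described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for "millions of records" of labeled training content.
texts = [
    "honestly the zipper broke after two weeks, kinda annoyed tbh",           # human
    "great bike but the seat hurts. still ride it every day",                 # human
    "This product offers exceptional value and outstanding performance.",     # AI-ish
    "In conclusion, this item represents a worthwhile investment overall.",   # AI-ish
]
labels = [0, 0, 1, 1]  # 0 = human, 1 = AI

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

new_text = "Overall, this product delivers remarkable quality and reliability."
confidence_ai = detector.predict_proba([new_text])[0][1]
print(f"AI confidence: {confidence_ai:.2f}")  # near 1.0 leans AI, near 0.0 leans human
```

On a real corpus, the same setup also exhibits the failure mode Jonathan describes: text styles that are underrepresented in the training data are the ones the model scores least reliably.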
Tim Curtis: So, for advanced writing, that makes sense. Advanced writers will break grammatical rules for emphasis, for energy in the writing. You'll see different types of things that sit outside traditional grammar rules. That's because when you're writing at that level, and especially if you're doing persuasive or creative writing, you'll take liberties, and those liberties can have those ramifications.
Let's flip it then. If you're a brand and you want to make sure you're writing the technical copy as well as the descriptive copy, you now have to start thinking about how the algorithm will judge it. How do you approach that?
Jonathan Gillham: [00:18:00] Yes. Let's talk about Google for a second and how Google is viewing this. Why do we care? Why do we care if it's AI-generated or not? If it's AI-generated and it's great, do we care? What we've seen out of Google is that they're facing an existential threat: if their search results are filled with nothing but AI-generated content, why would somebody go to Google? Why wouldn't they just go to the AI, which would have richer information about the person doing the searching?
But at the same time, Google is also facing an existential threat where, if they don't embrace AI, this could be their Kodak moment: they invented this technology but let other companies commercialize it and got left in the dust.
So, they need to take an anti-AI-content approach in their search results while keeping a positive posture on AI and developing AI technology. What Google has said is that they don't care if it's AI-generated or human-generated as long as it adds value. That's what they say. As is always the case with Google, you can't always trust what they say.
What we have seen is [00:19:00] that they've taken incredibly aggressive action against AI spam. Not all AI is spam, but I think all spam in 2024 is AI-generated. And what we've seen, in the March update in particular, where they de-indexed thousands of sites, is that the majority of those sites were filled with almost nothing but AI-generated content.
There could have been other markers as well, but it is very clear that sites that rapidly increased their publishing with AI-generated content came to a negative end. So, to your question, how do brands make sure that doesn't happen to them? That's something we saw, where brands came to us after that March update saying, our writers didn't use AI.
We ran an analysis, and we're like, they did. You didn't have any controls in place. You didn't know that they were publishing at an increasing rate and that they were using AI, and they lost their business as a result of Google's manual action.
What do companies do? Put a policy in place. And then, what happens if it [00:20:00] is technical content? How do you still make sure AI is not involved? There's a combination of tools you can use. Detection is great. There's also writing environment monitoring: if somebody creates a document in a Google Doc, we have a free Chrome extension that lets you visualize the creation process.
Oftentimes what you care about is that, yes, a human was heavily involved in the creation of this document, so you can trust the output. Using the Chrome extension with a Google Doc, you're able to visualize the entire creation process of that document and know that a human was involved in the effort behind its creation.
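To illustrate the writing-environment idea, here is a hedged sketch of what monitoring a document's creation could look like: replay timestamped edit events and flag documents that arrived mostly as large pastes rather than incremental typing. The event format and thresholds are assumptions made for the example, not how the actual Chrome extension works.

```python
from dataclasses import dataclass

@dataclass
class EditEvent:
    timestamp: float  # seconds since the writing session started
    chars_added: int  # size of this insertion

def looks_human_written(events: list[EditEvent], paste_threshold: int = 500) -> bool:
    """Heuristic: mostly small, spread-out insertions suggest incremental human typing."""
    if not events:
        return False
    pasted = sum(e.chars_added for e in events if e.chars_added >= paste_threshold)
    total = sum(e.chars_added for e in events)
    return pasted / total < 0.5  # assumed cutoff: under half the text arrived as big pastes

# An incremental typing pattern passes; a single 4,000-character paste would not.
session = [EditEvent(1.0, 12), EditEvent(3.5, 8), EditEvent(9.2, 15), EditEvent(20.1, 30)]
print(looks_human_written(session))  # True
```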
Tim Curtis: That's the augmented intelligence, the human and the AI working together to produce something even better.
Jonathan Gillham: Yeah, exactly.
Erik Martinez: That's really fascinating, because I think the challenge is, as a brand, how do I leverage a tool like Originality.AI in my business and content creation processes? But also, how do we leverage [00:21:00] that same tool to check the veracity of the research and all the other things? That was a really good tip.
Do you think there's going to be wide adoption of detection tools like yours among consumers? Because at the end of the day, you know, as brands, we can all go out and leverage these tools and try to hit best practices as they continue to evolve in this space. But how do we communicate to consumers that the information they're seeing is, quote, verified? Is that the responsibility of the search engines and the platforms? How do you see that evolving?
Jonathan Gillham: So, I don't think there'll be heavy adoption of detection by consumers. I think platforms will do the policing. And I also think the importance of the author will increase. The question people ask is, can I believe this piece of content? And if [00:22:00] it's a piece of content that, you know, Erik, has your name on it as the author and you're sending it as an email, that reputation is there to be lost.
If you use AI, and it gets shown that you're using AI and producing factually inaccurate information and not adding value, then your reputation gets harmed. The filter most consumers are going to use is trust in the brand and trust in the author sharing that information. I think that is going to be more significant from a consumer standpoint than them using a tool that says, I think this might have been created by AI.
For example, our core AI research team is made up of individuals for whom English is a second language. Brilliant people who share incredible studies and analyses but use AI to help in creating those pieces of content. We publish that with the author attached, so you know who created it, and I feel good about that from a reputation-risk [00:23:00] standpoint, because there's no way AI was creating that piece of content on its own. It wouldn't have had the data; it needed that human to provide the insight, and then AI came along to clean it up. What brands can do is have that traceability back to the author and maintain a brand reputation that lets people believe the work they're reading.
Erik Martinez: Yeah, I'm wondering, because it's human nature to a certain extent, we have preconceived notions about authors or the veracity of information based on who that person is. In industry circles, I think it's easier in some ways to maintain that reputation, because there are only a certain number of people in that space, right?
Most industries are a lot smaller than they look on the surface when you really dig down into them, right? So, it's a pretty well-known cohort. I'm still struggling with the idea that, you know, the consumer, take my behavior, [00:24:00] for example. You know, I will read a handful of reviews if I'm really interested in a product. I'm researching EVs.
My process when I'm researching something like that is, you know, I'll go find Car and Driver. I'll find articles on Edmunds. I'll find articles from multiple different sources and cross-correlate them, so I can say, okay, I'm seeing a consistent pattern of good reviews about this vehicle from a variety of sources. But I'm not sure everybody does that.
I'm still kind of curious. How do we get over some of that confirmation bias? Just because Tim said it, it's true, right? But Tim may have had a very bad hair day and put something out without the time or all the thought. Tim would never do that, but the fictional Tim might. We all have those days. So, how do you get around that?
Jonathan Gillham: [00:25:00] For right now, I don't think you do. And I think your point is really valid: in a world you deeply understand, you're seeing multiple interactions with an individual, and they have a reputation to increase or decrease. But if you're just quickly looking to buy a bike online, that's not your world. You don't have enough touch points for that reputation to precede the interaction you're having.
I don't think there is any immediate solution to that problem. I think that's the world generative AI has introduced us to. It's similar to, you know, going back to the 2000s: don't use Wikipedia for sourcing; you can't believe what you see on the internet. We have a long history of that. It's just now been taken to the next level, where we have to be even more skeptical of what we're seeing.
My hope is, and I don't think we're the solution to this problem, but in general, how does society deal with the problem of things online being untrustworthy because of the amount of AI slop [00:26:00] that exists? I believe the answer is going to lie in some solution like Klout, which was around for a period of time as a sort of measure of social media influence. I think that's likely where this lands: a reputation that follows you around on the internet will be part of the solution. But I'm not sure.
Erik Martinez: Yeah, well, we're all making up the rules as we go along anyway.
Tim Curtis: Yeah, we're all surfing it. I mean, no one knows. We don't even know how the courts are going to rule. We know that AI is here to stay. It is a reality. There's no getting around that. The pervasiveness of its use is mind-boggling when you think about the numbers. So yeah, we'll see how this all begins to play out. If you had one piece of advice to leave with the listeners, one last piece, what would that be?
Jonathan Gillham: I'd say don't bury your head in the sand as it relates to AI and its impact on your business. The unthoughtful use of AI within your business and within its [00:27:00] marketing can lead to significant consequences. That's worth thinking about. If you're okay with those risks, you can hammer AI. If you're concerned about what you have built and want to make sure that it's protected, put in place sensible controls so that AI doesn't run wild in your business, causing irreparable reputational harm.
Tim Curtis: What's a good way for people to reach out and find you?
Jonathan Gillham: Yeah. So, the website is originality.ai, and you can find me there or on LinkedIn, Jon Gillham.
Tim Curtis: All right. Well, thanks, Jon, for coming on. Thanks everyone for listening. This is all for today's episode of the Digital Velocity Podcast.
Have a good week. [00:28:00]