Podcast Episode

Why and How to Use AI for your Website Content

with Jon Gillham

Episode Notes

Summary

In this episode, Jon Gillham, founder and CEO of Originality.ai, discusses AI-generated content and its impact on SEO. He explains that Google is in a tricky position, as AI-generated content could potentially replace the need for Google search. However, Google also needs to ensure that its search results are not filled with spammy AI content. Jon emphasizes the importance of collaboration between AI and human moderators to maintain authenticity and trustworthiness in content. He also discusses the challenges of identifying AI-generated content on review platforms and the potential risks of AI in social media posts, videos, and images. Jon predicts that authorship will play a significant role in content authenticity in the future.

Links

https://originality.ai?lmref=4GukjA
https://www.linkedin.com/in/jon-gillham-80912a14a/
https://www.linkedin.com/company/originality-ai/
https://twitter.com/aioriginality
https://www.facebook.com/groups/5519865934778966/

Takeaways

  • Google is facing an existential threat from AI-generated content, as it could potentially replace the need for Google search.
  • Collaboration between AI and human moderators is crucial to maintain authenticity and trustworthiness in content.
  • Identifying AI-generated content in review platforms is a challenge, and platforms are working to mitigate this issue.
  • AI-generated content in social media posts, videos, and images has the potential to cause mass misinformation and scams.
  • Authorship will play a significant role in content authenticity, and publishers need to be aware of the risks and value of AI-generated content.

Chapters

00:00 Introduction to AI-Generated Content and SEO
02:39 Google’s Tricky Position with AI-Generated Content
07:11 The Crucial Role of Collaboration Between AI and Human Moderators
11:15 Challenges in Identifying AI-Generated Content in Review Platforms
20:38 The Future of Content Authenticity: The Role of Authorship

 

Free Website Evaluation: FroBro.com/Dominate

Transcript

Jeffro (00:01.101)
Welcome back to Digital Dominance. Today we’re talking about AI, specifically with regard to content originality and how that relates to SEO. We’re going to talk about Google’s perspective on AI-generated content, the challenges and opportunities presented by AI in content creation, and the crucial role of collaboration between AI and human moderators. My guest today is Jon Gillham, founder and CEO of Originality.ai, where they are on a mission to detect AI-generated content in order to restore value and trustworthiness to the internet. Now, whether you’ve tried using AI-generated content or you’re simply curious to learn, this is going to be a fun conversation. Jon, welcome to Digital Dominance.

Jon (Originality.ai) (00:39.982)
Thanks, Jeffro. Yeah, always happy to nerd out on this topic.

Jeffro (00:43.885)
Yeah, I mean, there’s so much here. So maybe we can start by having you give us your 30-second version of how you became an expert in generative AI?

Jon (Originality.ai) (00:53.518)
Sure, yeah. So my background was actually in content marketing. I had a handful of businesses that were related to getting content on the web and attracting traffic from Google, and that was the primary customer acquisition path for most of my businesses. I built a content marketing business and sold it. At the time we were a super heavy user of Jasper, and I had seen this wave coming around generative AI.

And I knew that we needed to build a solution that would help web publishers understand if their writers were using generative AI or not. We ended up launching the weekend before ChatGPT launched, and then things got a little crazy after that.

Jeffro (01:35.533)
Yeah, I imagine so. What was that like?

Jon (Originality.ai) (01:39.406)
Yeah, I mean, we wish we had a little bit more time, because we were one of the first purpose-built solutions for detecting if generative AI had been used, and we would have loved to have more time to educate the market. The noise around ChatGPT, and generative AI in general, has been so incredibly overwhelming that I wish we had had more time to help educate people on the efficacy, the limitations, and the overall accuracy of detectors before that launch.

Jeffro (02:14.445)
Yeah, because a lot of people found out the hard way that it could just make stuff up, right? And cause other sorts of problems. So when we talk about AI-generated content in the context of SEO, the first question is always whether Google will penalize content if it detects it was written by AI. My assumption has always been that Google doesn’t care as long as it’s providing real value and people like it. But…

Jon (Originality.ai) (02:18.158)
Yeah, exactly. Yeah.

Jeffro (02:39.277)
I’m curious about your experience and your opinion. Can you shed some light on how Google views AI-generated content?

Jon (Originality.ai) (02:45.006)
Yeah, I think, you know, no one knows. Maybe not even Google, right? That’s kind of how it works with their ability to understand their own search engine. So the way I think about it is from a super high level, and then, what’s the data showing, and I try to understand it through those two lenses. And I think Google is in a really tricky spot where they’re facing an existential threat.

If their search results are filled with nothing but AI-generated content, then why would people go to Google? Why wouldn’t they just go to the AI that provided that answer? And Google needs to present an AI-first, AI-forward stance as a company. I mean, Google I/O didn’t mention search much, but mentioned AI a lot. Yet if their search results are nothing but

Jeffro (03:31.821)
Mm -hmm.

Jon (Originality.ai) (03:39.842)
AI, then they potentially kill their cash cow, which is search and ads. So that’s the super high level. As for what they say, that’s always questionable, but listening to Google, I think what they say is: spam bad, spam in any form bad. And I think what’s true right now is that not all AI content is spam. There are great uses for AI content.

Jeffro (03:59.533)
Right.

Jon (Originality.ai) (04:07.086)
But it’s pretty clear that anybody spamming the internet with text right now would be doing so with AI. Not all AI content is spam, but all spam is AI right now. And what we’ve seen in the results within Google is their March 5th update, when they took manual action: the majority of sites that got de-indexed had mostly AI-generated content, compared to the sites that weren’t as impacted, which had a lower share of AI content.

Jeffro (04:31.789)
Hmm.

Jon (Originality.ai) (04:37.55)
And so again, maybe that’s all spam, but some of those sites maybe weren’t all spam. And then what we’re also seeing in the search results is a move from roughly 4% or 5% of search results being AI content up to now 13% being AI-generated content. At what point does Google say something? I think Google is fighting a war against AI spam, and they are ultimately, currently, losing that battle. So that’s my view on the complex question of the relationship between Google and generative AI text.

Jeffro (05:18.285)
Yeah, well, I mean, I think they could still have a value-add as a search engine, right? Because if they’re able to detect some of this AI content, then they just add a filter at the top that says “remove known AI results.” And then, okay, now you can see just the ones that are verified human, or however they do it, right? That’s not going to be 100% accurate, but it might be a helpful place to start, especially if we’re talking about, you know, people worried about

what sources they can trust. This is a way to bring back some validity and let the user know what they’re getting instead of just hoping it’s right. They’ll at least know, okay, a person wrote this, as opposed to it just getting spit out by some AI.

Jon (Originality.ai) (06:00.398)
Yeah, that’s an interesting idea. And I think Google is asking for that in a few different parts of their product. Google Merchant Center, for example, asks merchants to disclose if a product description or listing is AI-generated. So yeah, it could be a direction they end up going in search as well.

Jeffro (06:22.061)
Yeah. And I think it depends on the type of content, because for a lot of stuff, who cares if it was written by AI? Like if it’s a description of a widget or a t-shirt that you’re buying, okay, it’s a red t-shirt with a thing on it. I don’t care that a person wrote it; as long as it describes it, that’s fine. But if it’s an analysis of, you know, some political bill that just got passed, okay, maybe I want to know who wrote this and what their biases are.

Jon (Originality.ai) (06:38.508)
Yep.

Jeffro (06:49.581)
Or if it’s some educational content, obviously I want to know that it’s accurate. And so yeah, it is a tricky situation. So I want to kind of move from here into your approach and the collaboration between AI and human moderators. That seems to be effective, right? Can you share some insights on how that typically works?

Jon (Originality.ai) (07:11.982)
Yeah, so again, we’re pro-AI, not anti-AI; we use it ourselves. I’m an engineer who would much prefer to think in spreadsheets than words, and the use of ChatGPT or other tools is a big lift for my ability to communicate with text. The people that we’ve seen use the

Jeffro (07:26.219)
Mm -hmm.

Jon (Originality.ai) (07:39.822)
tool and work with writers most successfully in the world of generative AI, whether that’s a marketing agency they’re hiring or individual writers, do a few things: they establish an understanding up front of what is and isn’t allowed; they have a relationship with whoever they’re working with, so that any misunderstandings can be resolved; and they put reasonable controls in place, Originality being one tool that helps with that.

And then there’s a final, optional check where they can visualize the creation process of a document. We have a free Chrome extension to help people do that: if there’s ever a false positive, where the tool incorrectly identifies content as AI-generated, you can go and visualize the creation process to know that it was truly human-created. So that’s the process: establishing an understanding of what’s allowed, having a relationship with the writer or marketing agency you’re working with, having reasonable controls in place, and then knowing how to handle any situation where AI was used. If it wasn’t supposed to be used, it can be handled.

Jeffro (08:53.293)
So can you talk a little bit more about that visualization you mentioned of the path of getting there? Is that like an audit log essentially that, you know, Jim logged in at this time, he wrote a hundred words, one character at a time, then he pasted a block of text? Is that what it is?

Jon (Originality.ai) (09:10.318)
Yeah, exactly. Inside Google Docs, there’s a whole bunch of metadata that gets saved within each document. You can open up the revision history, and it gives you pretty high-level writing sessions and character counts. What our Chrome extension does, and again, it’s totally free for people to use, is go in and extract all that information and then create a character-by-character recreation of the writing process that the document went through, which you can watch at accelerated speed. So it’ll visualize the creation of that document, which is pretty cool to see. You’ll see how people write: some people will write a whole bunch, edit it, then write a whole bunch more and edit that; some people will write top to bottom and then go back and edit from top to bottom. It’s interesting to see. So that’s one part of it. And the other part of it

Jeffro (09:45.485)
Interesting.

Jon (Originality.ai) (10:06.222)
provides that sort of audit report. So there’s the visualizing, and then there’s the report, and the report provides a handy graph that shows the creation over time. If you see nothing but a straight line up, 5,000 characters, you know, a thousand words, written in one minute in one writing session, with a hundred percent probability of being AI-generated, it’s pretty hard for a writer to dispute that situation. And so again, if…

Now, sometimes people will write in Word or Grammarly or whatever it might be, and then copy and paste into their Google Doc. And so that comes back to that first step of establishing a process with whoever you’re working with, so that everyone knows: okay, we’re going to use Google Docs, and because of that, we’re going to have this capability in the future if there’s a disagreement about the content.
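The kind of check described above, flagging a session where thousands of characters appear in a single minute, can be sketched with a simple typing-speed heuristic. This is purely illustrative: the session format and threshold below are assumptions, not Originality.ai's actual data model or extension internals.

```python
# Hedged sketch: flag writing sessions whose characters-per-minute rate is
# implausible for live typing, suggesting pasted (possibly AI-generated) text.
# WritingSession is a hypothetical shape; real revision metadata (e.g. from
# Google Docs revision history) would need to be extracted and mapped into it.
from dataclasses import dataclass

@dataclass
class WritingSession:
    chars_added: int         # characters added during this session
    duration_seconds: float  # length of the session

# Sustained typing much beyond ~1,000 chars/minute (~200 wpm) is rare,
# so anything faster is treated as paste-like. Threshold is an assumption.
PASTE_THRESHOLD_CPM = 1000.0

def flag_suspicious(sessions: list[WritingSession]) -> list[int]:
    """Return indices of sessions faster than the paste threshold."""
    flagged = []
    for i, s in enumerate(sessions):
        cpm = s.chars_added / (s.duration_seconds / 60.0)
        if cpm > PASTE_THRESHOLD_CPM:
            flagged.append(i)
    return flagged

sessions = [
    WritingSession(chars_added=420, duration_seconds=300),  # ~84 cpm: normal typing
    WritingSession(chars_added=5000, duration_seconds=60),  # 5,000 cpm: paste-like
]
print(flag_suspicious(sessions))  # → [1]
```

A real audit trail would of course combine this with the visual replay Jon describes, since a fast session alone might just be a legitimate paste from another editor.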

Jeffro (10:47.437)
Right.

Jeffro (10:57.485)
Okay, well that makes sense, especially on the content generation side. What about identifying AI where you don’t want it to be, like review platforms, you know, Glassdoor, Amazon? How do you know they’re real reviews, or that someone didn’t just generate one and paste it up there? Because you guys did some studies on that, right?

Jon (Originality.ai) (11:07.734)
Yep.

Jon (Originality.ai) (11:15.854)
Yeah, so we did a lot of studies. I mean, you raised a question earlier about places where, who cares if it’s AI-generated? A summary of a sports score? Great, give me the info, that’s what I wanted, awesome, thank you. But I think society is pretty much in agreement that we don’t want the reviews we read to be AI-generated. We don’t want a derivative of all existing knowledge and all existing reviews; we want to understand how a person interacted with that product, that place, that thing, whatever it might be. So what we looked at was review websites and their rate of AI-generated content, going back pre-GPT-2, pre-GPT-3, and then into the post-ChatGPT launch. What we saw was in line with our false positive rate, like 3% to 5%, up to GPT-2; with GPT-3 it started to ramp up; and then post-ChatGPT, some websites were getting up to 30% of their reviews being AI-generated. Which, you know, if you’re reading reviews, you don’t want to be completing a Turing test every time you go online: am I talking to a computer? Am I reading this from an AI, or from a human?

Jeffro (12:23.851)
Wow.

Jeffro (12:37.869)
Well, that’s a big problem. I mean, I would hope that the platforms themselves are doing something to try to mitigate that. Are some platforms better at it than others? Like, is Amazon going to catch those more often than Capterra, or do we not know?

Jon (Originality.ai) (12:52.43)
Well, I think they’re all getting up to speed, effectively. What we’ve seen is a spike, and then some decline after that spike as they’ve put effort in. So I think these platforms are recognizing that this is an existential threat for them and that they need to be dealing with AI content. It’s not solved yet; I think it’s going to be a constant battle for them. And it’s also tricky because…

Jeffro (13:01.293)
Mm -hmm.

Jon (Originality.ai) (13:21.422)
if I wrote a review and thought, that sounds stupid, and then asked ChatGPT, can you make this sound better? It would get tagged as AI, but it’s really what I said. So it’s tricky for them to handle that. And they want more reviews: if they could choose between more reviews or fewer reviews on their platform, they would choose more. So they don’t want to cut out good, useful reviews that are really from somebody, but they do need to make sure the platform doesn’t get overrun by AI that no one can have confidence in. So I think they definitely care about this problem a lot and are working to address it.

Jeffro (13:57.389)
Mm -hmm.

Got it. That makes sense. Another question I had: ChatGPT is going to keep getting better, and all these other platforms are going to keep getting better. So these scanners are going to have a harder time identifying AI-generated content. What’s going to happen there? Is the false positive rate just going to be higher?

Jon (Originality.ai) (14:17.774)
Yeah, it’s a great question, and it’s one that we’ve thought about a ton. I think there are two cases for how the world plays out; we don’t know which. I’d say I’m probably 60% confident in one option, 40% in the other. The 40% case: it’s hard to bet against exponential progress within generative AI. When GPT-10 comes out, are you going to be able to tell the difference or not?

Jeffro (14:43.245)
Mm -hmm.

Jon (Originality.ai) (14:47.054)
That’s a pretty logical statement, and I’d say that’s the 40% probability case. I think the more likely case, at least in the near term, on a five-year timeline, is this: what we’ve seen in the last two years is that the rate of progress of these LLMs has been phenomenal in terms of usability and base-level intelligence, but the way they construct content has not changed a ton since GPT-3 and 3.5. What we used to see was that with every new model, like GPT-2 to GPT-3, our detection efficacy dropped off dramatically. But as each new model has come out, our drop-off in efficacy has shrunk, meaning we’re continuing to be fairly accurate on every new model. So we were 99% on GPT-4 Turbo; GPT-4o came out and we dropped to 96%, and we’ll close that gap again. Whereas we used to drop from 99% to 70% or 60%. The logic behind that, and why I say this is the more likely scenario, is that these models are consuming all the data that is available.

Jeffro (15:50.157)
Hmm.

Jon (Originality.ai) (16:12.302)
They are being trained on the same machines, the same transformer technology, and they are all converging toward an average style of writing that is recognizable and challenging to move too far away from. So what we’ve seen, even with adversarial attacks, is that our ability to close the gap on detection has been outstripping the models’ ability to create a

Jeffro (16:29.515)
Mm -hmm.

Jon (Originality.ai) (16:42.53)
diverse set of type of writing. And that’s also not what the models are focusing on. The models like the big LLMs aren’t focusing on like, hey, we want to be more different in our writing style. We want to be like, no, our writing is near world -class and we want to have more knowledge that is usable in more and unique ways with fewer errors.

Jeffro (17:06.925)
Right. So they’re not trying to infuse occasional errors to make it look like it was written by a human. They’re just trying to be better. Okay, that’s interesting. So your scanners are that good, right? That you can identify stuff in the high 90s percent of the time. Is there a way we get badges on the content we publish, to say, hey, this was verified original? And if so, what about, you know, could I go back and change it?

Jon (Originality.ai) (17:14.444)
Yep. Yep.

Jeffro (17:37.165)
Do I get the badge and then change it to something else afterwards? How does that work?

Jon (Originality.ai) (17:39.79)
Yeah, we thought about badges. I mean, we’d love to be the standard, right? That’d be such a great move if we could become the standard. But I don’t think it’s the right play. I think there’s a place for it, but we use AI content on our own website. Some of our most insightful, helpful posts are from our AI research leads, who have English as a second language and who use generative AI to help them write their content. If we put a badge on our, call it fluffier, marketing content, but then our hardcore, net-new-information-into-the-world content doesn’t get the badge because it was AI-generated, that doesn’t feel right. So I’m

Jeffro (18:15.053)
Mm -hmm.

Jon (Originality.ai) (18:32.75)
hesitant to pass judgment of AI bad, human good. I think it’s up to the publisher to choose when they use it and when they don’t. So while I like the idea of the badge, I don’t think that’s ultimately going to be the path forward.

Jeffro (18:38.613)
Mm -hmm.

Jeffro (18:55.437)
Yeah, I think you’re right. It’s gonna be too hard to maintain accuracy and people are gonna find ways around it and all that too. We’ve talked mostly about text content, right, and blog articles and stuff, but what about social media posts or even videos and images now that are being created with AI? Is this a problem there as well?

Jon (Originality.ai) (19:14.574)
So I think it’s a societally more consequential problem. I’ve seen some messages like, it’s too bad the 2024 election is going to be decided by an AI video in November, and we just don’t know what that video is. I think that societally, images, video, and audio have the potential to be more harmful, with mass misinformation, phishing, real-time phishing,

Jeffro (19:28.141)
Yeah.

Jon (Originality.ai) (19:41.742)
scams with voice. So it’s a potentially more significant problem for society, but a smaller-frequency one: the volume of text that needs to be checked outstrips the volume of images and video. And we don’t understand that problem the same way we understand the world of marketing, digital marketing, content marketing.

Jeffro (20:02.923)
Mm -hmm.

Jon (Originality.ai) (20:10.99)
So we haven’t built those tools. The team could build them, with the same sort of understanding of how to do it, but it’s a political buzzsaw that we don’t really want to put our heads into.

Jeffro (20:19.373)
Mm -hmm.

Jeffro (20:24.653)
Sure, makes sense. So what trends or developments do you see shaping the future of AI-driven content creation and verification?

Jon (Originality.ai) (20:38.03)
I think authorship is going to become a bigger and bigger part of publishing content. You know, if I’m reading an article that I know was published by a newsletter from so-and-so that I trust and follow and have built up a rapport with, and I find out it was AI-generated, as long as I got something out of it, I’m not annoyed, because my guess is they hopefully also reviewed it. And if it happens enough times that they didn’t, then the content is going to drop in quality and I’ll stop assigning that level of trust to them. So I think authorship, and I don’t exactly know how it plays out, is my suspicion for how we deal with this world of cyborg writing. You know, Grammarly: is that AI? If you accept all of Grammarly’s suggested changes, you’ve essentially used AI to paraphrase your article. If you’re not a very good writer, like me, it’s Grammarly rewriting the whole paper, but it’s still my thoughts and my communication that I’m sharing. So is that bad or not bad? I think authorship is the overall trend in how this plays out around content authenticity: what do we believe, and who do we believe?

Jeffro (21:41.931)
Mm hmm.

Jeffro (21:50.957)
Right.

Jon (Originality.ai) (22:03.288)
I think authorship is going to be the theme that plays out over the next handful of years.

Jeffro (22:07.501)
So basically, by stamping my name on something, I’m saying I stand behind it. Whether I wrote every word of it or used AI to help me write it, I’m okay with the final product and approve it as coming from me.

Jon (Originality.ai) (22:18.83)
Exactly. Similar to celebrities that write books: we know they didn’t do the words. I think there’s maybe a diminished level of assigned value to a ghostwritten autobiography, but it’s still valuable.

Jeffro (22:35.661)
Right.

So let’s bring this back to our listeners as service-based businesses. If they want to leverage AI-generated content for SEO purposes, what advice would you give them to make sure there’s authenticity and compliance with any search engine guidelines that come up around this?

Jon (Originality.ai) (22:56.11)
Yeah. I mean, this is who we built this tool for: the decision on whether or not to publish AI-generated content on your website should be the publisher’s, the service business’s. I own a pool business with my family; we use some AI-generated content on that site, but we want to know that we’re the ones making the decision to use it. We understand there’s a risk associated with publishing AI-generated content.

And we’re also extracting fair value from the marketing company we’re working with, because we know it’s AI-generated content. For any service-based business working with a writer or a content marketing company or an SEO company: if you’re still paying the same rates you used to pay and they’re using AI, that’s a pretty raw deal, and you should be aware of it. You know, a lot of people are happy to pay, let’s call it a hundred dollars for an article.

Jeffro (23:32.205)
Mm -hmm.

Jeffro (23:48.909)
Yeah. Mm -hmm.

Jon (Originality.ai) (23:53.71)
You’re not super happy to find out that it was copied and pasted out of ChatGPT in five seconds and published on your site. So go to step one with whoever you work with on your marketing team: make a decision on whether you’re going to use AI content and under what conditions, and understand that there is a risk associated with using it. I think the worst-case scenario is that you’re accepting the risk of publishing AI-generated content on your site, you don’t know you’re accepting that risk, you’re paying as if humans were writing it, and your marketing company or agency, whoever you’re working with, is extracting all that value.

Jeffro (24:32.237)
Right, so just be aware of what you’re paying for and what you’re getting, whether you do it yourself or hire someone else. That makes sense. Well, Jon, thank you for joining me today. AI is always a fascinating topic, and it’s impressive to watch how fast things continue to evolve. For those of you listening at home, check out the links in the show notes to connect with Jon, check out his stuff, reach him on LinkedIn and all of that.

Last question for you, Jon. What’s the most important thing you want the listeners to remember from this episode?

Jon (Originality.ai) (25:03.39)
So I’d say there are two common misconceptions that I’m always happy to talk about around AI detection. The world can get pretty binary: things work or things don’t work. AI detectors are not perfect, but they are highly accurate, and it’s worth understanding them. There’s a sense that AI detectors are BS, or that AI detectors are perfect; neither of those polar extremes is true. There is an effectiveness to them that should be understood.

And the second piece is the score. People will often read a score as, say, a 75% chance of being human and a 25% chance of being AI, and misinterpret it. The way the classifiers work, that’s a probability, like the probability that it’s going to rain: the probability that the document is human, not the percent of the content that is human versus AI. So those are the two most common misconceptions in the world of AI detectors. And, you know, the genie’s out of the bottle and these tools are going to be around, so I’m always happy to try and help educate people on those two points.
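The score misconception Jon describes can be made concrete with a tiny sketch. The helper below is purely illustrative, not Originality.ai's API; it just renders a detector score the way the classifier means it, as a document-level probability rather than a proportion of the text.

```python
# Hedged sketch: an AI-detector score is a document-level probability,
# not a percentage of the text. interpret_score is a hypothetical helper.
def interpret_score(p_human: float) -> str:
    """Spell out what a classifier's 'human' probability does and doesn't mean."""
    p_ai = 1.0 - p_human
    return (f"{p_human:.0%} confidence the whole document is human-written "
            f"(and {p_ai:.0%} confidence it is AI), "
            f"not '{p_ai:.0%} of the text is AI'")

print(interpret_score(0.75))
```

So a "75% human" result is analogous to a 75% chance of rain: one probability about the whole document, not a claim that a quarter of the sentences were machine-written.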

Jeffro (26:05.421)
Awesome. Well, I’m glad those tools exist, because they’re helpful and needed for all the reasons we discussed, and I know you’ll continue to improve them. So thanks again for being here, Jon. Thanks to all of you for listening. Keep on dominating, and we’ll see you in the next episode. Take care.

Jon (Originality.ai) (26:20.718)
Thanks, everyone.

© 2016 – 2024 FroBro Web Technologies

27472 Portola Parkway #205-241, Foothill Ranch, CA 92610

info@frobroweb.com | Privacy Policy
