Blog #3: Exploring Privacy

We can curate our public personas to the best of our ability, but not our private ones. The private persona is where the rubber meets the road, because it bleeds into real life. What I'm interested in is this: the goalposts of what counts as private information have moved to the point where data we used to consider private (our location, which sites we're logged into, our web history, even recordings of our speech on certain devices) is now actively used to shape the content that algorithms serve us. Does this create a feedback loop, in which content supplied on the basis of our private preferences begins to influence us in real life?

That's a bit abstract, but reading Robin Linus's webkay site, which shows just how much information a website can glean the moment you open your browser, got me thinking about how this kind of feedback loop gets created. One thing that surprised me is that Google can see which websites you're logged into.
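To make webkay's point concrete: any page you visit can read details like these with a few lines of JavaScript, no permission prompt required. This is a minimal sketch using the standard `navigator` and `screen` browser APIs; the `browserSnapshot` helper name is my own, and real fingerprinting scripts collect far more than this.

```javascript
// A minimal sketch of the kind of passive data collection webkay demonstrates.
// Pass in the page's `navigator` and `screen` objects; returns a small profile.
function browserSnapshot(nav, scr) {
  return {
    userAgent: nav.userAgent,        // browser and OS version string
    language: nav.language,          // preferred interface language
    platform: nav.platform,          // rough OS family
    cores: nav.hardwareConcurrency,  // logical CPU count
    screen: scr ? `${scr.width}x${scr.height}` : undefined, // display size
  };
}

// In a real page you would simply call: browserSnapshot(navigator, screen)
```

None of this requires cookies or a login; it's available to every site on every visit, which is part of why the "private vs. public" line is so blurry.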

Can Google tailor content based on the type of social media you log into the most? For example, professional content for people who use LinkedIn all the time? Or more "artsy" content for someone who enjoys Pinterest?

An easier (and much funnier) way to grasp what I'm talking about is through this sketch video: 


A lot of our personality, like it or not, is identified in public by our taste and the media we consume. What I like about this video is its suggestion that the internet uses our private preferences to create things we publicly enjoy, which we then claim as private tastes we "discovered" ourselves. In reality, they are being fed to us.

Private Information for AI Models

This is where it really starts to get murky. I read this article from Scientific American on how our personal information is used to train AI models, something I kind of knew was happening but didn't want to admit. The article notes that "Amazon, too, says it uses some voice data from customers’ Alexa conversations to train its LLM." That seemed like a flagrant invasion of privacy to me, and I questioned its legality, until I read that companies even deploy web scrapers that can bypass paywalls for data and content.

That’s when it hit me—most of us agree to this kind of data use without even realizing it. This TIME article points out how unreadable and overwhelming most Terms of Service agreements are, basically designed so we’ll just click “accept” and move on. That’s a huge problem, especially in the library science field, where we're supposed to be advocates for privacy and informed access to information. If we don’t fully understand how our own data is being collected and used, how can we guide patrons to protect theirs? It’s no longer just about guarding your library card number—it’s about knowing that every digital interaction could be feeding a system you never consented to. Being aware of this stuff helps us build better, safer learning environments and teach others to do the same.

Citations: 

Thompson, C. (2023, August 30). Your personal information is probably being used to train generative AI models. Scientific American. https://www.scientificamerican.com/article/your-personal-information-is-probably-being-used-to-train-generative-ai-models/

Ghosh, D. (2017, February 15). Why you should care about those terms of service agreements. TIME. https://time.com/4673602/terms-service-privacy-security/

Linus, R. (n.d.). What every browser knows about you. WebKay. https://webkay.robinlinus.com/
