Tobi Olabode 27/02/2021

The answer to for-profit social media: Digital Public Spaces

I read a WIRED opinion piece that talked about the need for public digital spaces: social media that is not driven by profit.

The author called them online parks. Digital spaces where you can have a good time, like public parks, but without billboards everywhere telling you to buy useless trinkets.

These spaces can resemble real parks, where you meet people of all backgrounds, and can move us closer to the goal of an open web.

Right now, you meet people around the world on the internet. But with echo chambers, you start to deal with people that think like you and may even look like you.

Funding the New Public Digital Space

The author explains that social media can’t be run for profit, because of the incentives that being a for-profit creates.

Funding the non-profit social media service will be an issue, which the author does mention in the article: it may need a wealthy backer in the beginning, almost like a seed round.

Maybe the service can run on donations at the very start. I don’t know.

Also, there needs to be a good selling point for people to use this service. Just saying the service does not make money may not be a good selling point. But I could be wrong. Signal is very popular right now because it is run by a non-profit organisation that promises not to collect your data in exchange for money.

Or maybe one can still have a for-profit service, but it will need a new monetisation model. Telegram is a good example of this. The app is free to use. But the company makes money by selling stickers on its platform.

Something similar can be done. It is the messaging equivalent of selling skins in video games (e.g. Fortnite). Some of the Asian messaging apps, like KakaoTalk and LINE, already do this.

Maybe the service will still want to sell ads, but without the algorithm promoting vitriol and anger. But that’s a massive unknown, because the most engaging posts mostly default to negative emotions.

I know people float the idea of a paid version of Facebook. But the fact is that the data helps improve the product. So a paid product that does not track you may not be as good as the free version, uncomfortable as that may be. Also, you want the service to be free so that as many users as possible sign up, which increases the value of the network through network effects.

What the Digital Public Space Could Look Like

While social media is flawed, it has allowed us to stay in touch with our friends and family anywhere in the world.

Imagine living through the pandemic without tech. It would be almost impossible to talk to your friends and family if they were a large distance away from you.

Social media allowed people to participate in communities while staying at home. (Some arguably bad).

The closest example of what the service could be is Twitter. It does not have as big a market cap as Facebook, but it has massive cultural relevance. It dictates the news cycle. It is a massive networking tool in many industries. And it is ground zero for the best memes.

The author said the service should be used for discourse. Just with a lower temperature. The main goal may not be privacy, but it certainly helps.

A service that gives you many of the functions of social media without the vitriol will be a good selling point. But that will require strong moderation, which will upset a few people.

But I guess you have the right to kick a person out of a park for doing damage or harming residents.

A non-profit or non-VC-backed service does not need to worry about growth to the N-th degree, almost breaking its product to get more users. This should avoid many of the problems social media companies have faced, as they were more worried about growing than about fixing the issues that plagued their services.

Issues Facing This Idea

Like I said earlier, I don’t want to make this into a Facebook and Twitter bashing battle. While they are flawed services, I think the world is better with them.

We just need to focus on iterating these services, and social media in general, in a better direction, to better align with the original vision of a free and open web.

Nothing can ever be perfect. Telegram and Signal are great messengers, but both have issues with dangerous groups using their platforms. A couple of years ago, Islamic terrorists were using Telegram to share propaganda. Signal is seeing a rise in far-right groups using the service.

So this new service won’t get rid of all the negativity on the web. And it won’t be without its issues. But having people use a service without feeling bad about themselves is a noble goal.

People should be allowed to speak about political topics without tearing apart society. And leading to further polarisation.

This online park idea has a big hill to climb. It will need a good selling point, so people join the platform. It will need a way to sustain itself. It will need a way to deal with many of the ethical issues hitting the platforms right now, like misinformation and moderation.

The service must be designed in a way that avoids polarisation and vitriol. That will require a lot of thinking. Many times, issues of trolling and negativity are simply a matter of community size. The larger a community gets, the more likely it is to have bad apples, and the harder it is to enforce community rules.

If that’s the case, how can a good experience be scaled up to millions of users?

The service will likely need to put safety first and foremost, not as an afterthought, unlike a few companies. Maybe safety could be integrated into the product team, so product decisions are made with safety in mind.

These are my thoughts on this topic. The article is interesting, and you should check it out. It provides a different model for social media. I don’t think non-profit social media will fix all the problems, but it can help.

Commercial social media is not all bad. It has allowed millions of people to communicate with each other.

…

If you liked this article, sign up to my mailing list, where I write more stuff like this.

Tobi Olabode 21/02/2021

Using assert statements to prevent errors in ML code

In my previous post, I showed what I learnt about unit testing. Testing tends not to be thought about when coding ML models (the exception being production). So, I thought it would be an interesting topic to learn about.

I found one unit test to try out because it solves an issue I face a lot when coding up my model.

The unit test checks if I’m passing the right shape of data into the model, because I make this simple mistake from time to time. This mistake can add hours to your project if you don’t know the source of the problem.

After I shared the blog post on Reddit, a Redditor commented: “Why not just use assert?”

That was something that did not cross my mind. So, I jogged my memory by checking out what assert did.

Then I started working out how to use it for testing ML code.

One of the most popular blog posts on the internet about ML testing uses assertion statements to test the code.

Most of the time, you wrap the assertion statement in a function. This is how unit tests can be made.

Assertion Statement for the Wrong Shape

I was able to hack up this simple assertion statement.

def test_shape():

  assert torch.Size((BATCH_SIZE, 3, 32, 32)) == images.shape

This is even shorter than the unit test I created in the last blog post.

I tried out the new test by dropping the batch dimension, the same thing I did in the last post.

images = images[0,:,:,:]

images.shape

Now calling test_shape() gives an assertion error.

To make the assertion error clearer, I added info about the shapes of the tensors.

def test_shape():

  assert torch.Size((BATCH_SIZE, 3, 32, 32)) == images.shape, f'Wrong Shape: {images.shape} != {torch.Size((BATCH_SIZE, 3, 32, 32))}'
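If you check shapes in more than one place, the same idea can be pulled into a small helper. This is only a minimal sketch: the name check_shape is made up for illustration, and the batch size of 4 is an assumed value. It uses plain tuples so it runs without PyTorch, but since torch.Size is a subclass of tuple you should be able to pass images.shape straight in.

```python
def check_shape(actual, expected):
    """Raise an AssertionError with a readable message if the shapes differ."""
    # tuple() lets the comparison work for torch.Size, lists and plain tuples
    actual, expected = tuple(actual), tuple(expected)
    assert actual == expected, f'Wrong Shape: {actual} != {expected}'


BATCH_SIZE = 4  # assumed value for this sketch

# A correctly shaped batch passes silently
check_shape((BATCH_SIZE, 3, 32, 32), (BATCH_SIZE, 3, 32, 32))

# A batch that lost its batch dimension fails with a clear message
try:
    check_shape((3, 32, 32), (BATCH_SIZE, 3, 32, 32))
except AssertionError as err:
    message = str(err)
    print(message)
```

One thing to keep in mind: Python skips assert statements when it is run with the -O flag, so this kind of check is best suited to development and debugging.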

This is super short. Now, you have something to try out straight away for your current project.

As I spend more time on this, I should be writing more about testing ML code.

An area I want to explore with ML testing is production, because I can imagine testing will be very important to make sure the data is all set up and ready before the model goes into production. (I don’t have the experience, so I’m only guessing.)

When I start work on my side projects, I can implement more testing on the model side and the production side, which would be a very useful skill to have.

If you liked this blog post, consider signing up to my mailing list, where I write more stuff like this.

Tobi Olabode 14/02/2021

Stop passing the wrong shape into your model with a unit test

When coding up a model, it can be easy to make a few trivial mistakes that lead to serious errors when training the model later on. That means more time debugging your model, only to find that your data was in the wrong shape or the layers were not configured properly.

Catching such mistakes earlier can make life so much easier.

I decided to do some googling around and found out that you can use testing libraries to automatically catch those mistakes for you.

Now passing the wrong shape through your layers should be a thing of the past.

Using unittest for your model

I’m going to use the standard unittest library, following this article: How to Trust Your Deep Learning Code.

All credit goes to the author. Have a look at that blog post for a great tutorial on unit testing deep learning code.

This test simply checks if your data is the same shape that you intend to fit into your model.

Trust me.

You don’t know how many times an error pops up that is connected to this, especially when you’re only half paying attention.

This test should take minutes to set up. And can save you hours in the future.

import unittest

import torch

# trainloader is assumed to be your training DataLoader

dataiter = iter(trainloader)

images, labels = next(dataiter)

class MyFirstTest(unittest.TestCase):

    def test_shape(self):

        self.assertEqual(torch.Size((4, 3, 32, 32)), images.shape)

To run the test:

unittest.main(argv=[''], verbosity=2, exit=False)

test_shape (__main__.MyFirstTest) ... ok

----------------------------------------------------------------------

Ran 1 test in 0.056s

OK


The batch number is hard-coded in. But this can be changed if we save our batch size into a separate variable.
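Here is a rough sketch of what that could look like. The names BATCH_SIZE and IMAGE_SHAPE are my own, and images_shape is a stand-in tuple for images.shape so the example runs without loading a dataset. With a real tensor, tuple(images.shape) gives the same kind of value.

```python
import unittest

BATCH_SIZE = 4             # assumed value; set it once, where the DataLoader is created
IMAGE_SHAPE = (3, 32, 32)  # channels, height, width

# Stand-in for images.shape so the sketch runs without a dataset
images_shape = (BATCH_SIZE, 3, 32, 32)


class ShapeTest(unittest.TestCase):

    def test_shape(self):
        # The expected shape is built from the variables, not hard-coded
        self.assertEqual((BATCH_SIZE, *IMAGE_SHAPE), tuple(images_shape))


# Run only this test case and keep the result object around
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShapeTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Now changing the batch size only means editing one variable, and the test follows along.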

The test with the wrong shape

Now let’s check out the test when the data has a different shape.

I’m just going to drop the batch dimension. This can be a mistake that could happen if you manipulated some of your tensors.

images = images[0,:,:,:]

images.shape

torch.Size([3, 32, 32])

unittest.main(argv=[''], verbosity=5, exit=False)

As we see, the unit test catches the error. This can save you time, as you won’t hit this issue later on when you start training.

I wanted to keep this one short. This is an area I’m still learning about. So I decided to share what I just learnt. And I wanted to have something you can try out straight away.

Visit these links.

These are far more detailed resources about unit testing for machine learning:

https://krokotsch.eu/cleancode/2020/08/11/Unit-Tests-for-Deep-Learning.html

https://towardsdatascience.com/pytest-for-data-scientists-2990319e55e6

https://medium.com/@keeper6928/how-to-unit-test-machine-learning-code-57cf6fd81765

https://towardsdatascience.com/unit-testing-for-data-scientists-dc5e0cd397fb

As I start to use unit testing more in my deep learning projects, I should be creating more blog posts about other short tests you can write to save you time and effort when debugging your model and data.

I used PyTorch for this. But it can be done with most other frameworks. TensorFlow has its own test module. So if that’s your thing then you should check it out.

Other people also use pytest and other testing libraries. I wanted to keep things simple. But if you’re interested, you can check them out for yourself and see how they can improve your tests.

…

If you liked this blog post, consider signing up to my mailing list, where I write more stuff like this.

Tobi Olabode 12/02/2021

Can Social Media Stop Misinformation with Media Literacy?

Stopping fire when it starts spreading

I was reading a great interactive article from growth.design, which talked about misinformation around the 2020 election and how Facebook tends to feed the problem, from a design perspective.

We all know that Facebook likes engagement, as it means more people interact with its service and stay on it for longer.

But that’s one of the main reasons why misinformation spreads.

Misinformation tends to be more engaging than real information. Because of that, the algorithm is more likely to show you something false, due to the high likelihood of it being shared.

When something is highly shared, people are more likely to share it as well. This is called the bandwagon effect.

This reminds me of the content moderation problems that the tech companies are facing. A lot of the work is stopping misinformation before it goes viral.

Lots of experts in this area say that most of the damage is done once the misinformation starts to pick up steam. By then, tons of people have already viewed it. And it’s hard to delete, because people will say the tech companies are overreaching, and the deletion may become a story itself thanks to the Streisand effect.

Tech companies need to work as a circuit breaker. They started to do this in overdrive as covid misinformation ramped up. Facebook and YouTube tried their best to stop covid misinformation from getting out. This was done on the algorithm side.

On the design side, the article recommended a nice solution to stop people from blindly sharing: a simple pop-up box telling you to read the article before sharing it. This should let people stop and think, and may stop them from sharing misinformation. Twitter ran this as a test and was able to reduce misinformation on the platform.

Sometimes removing misinformation requires hard decisions. The controversial banning of the former president led to a stark decrease in misinformation, by more than 50%.

Misinformation tends to be shared by nodes in a network. A popular person in a group shares misinformation. Then their fans run with that information. Some of those fans will be popular in their own smaller groups and share the same information. Those fans may share it with friends and family. And that’s how you get your uncle talking about QAnon.

So shutting down a popular node is very useful. But it can be controversial. So most social media companies opt for shadowbanning.

Shadowbanning and its disadvantages

YouTube is a great example, with its treatment of borderline content, which includes conspiracy theories, covid denial, racist videos etc. YouTube simply suppressed those videos, so they would not get recommended outside of their existing audience. This has led to the slow death of these YouTube channels. But it has entrenched news incumbents even further, and it does not stop misinformation coming from traditional news channels.

People who just talk about current affairs in general have been hit too, like Philip DeFranco and other independent YouTubers. The algorithm defaults to showing traditional news channels like the BBC, CNN, Fox News etc. Because of this, YouTube has pushed news towards a more establishment bias, which, while more level-headed, is still a bias.

I understand why they did this. They want to get rid of the ranters talking about microchipped aliens while still providing news on their service. Traditional news networks are known entities. You don’t want a PR disaster from recommending a random YouTuber providing anti-vax content. The tech companies can’t know all their creators inside and out. So, a blanket ban is all they can do.

But a lot of media literacy can’t just be done by social media companies alone.

It is likely a failure in education.

Social media is only part of the problem

Schools don’t teach kids how to think critically. (NOTE: some problems with critical thinking classes)

Teaching people from a young age to differentiate between types of media would help.

Asking questions like:

Is the website sketchy?

Does the article have any sources backing it up?

But it will be very difficult. In a place like America, local boards control the curriculum. That’s not bad. But it makes it difficult to implement changes like these.

There is also a lack of incentives for political leaders to back these changes. Do you want a population that can think for itself and starts asking hard questions about your policies? Hiding behind simple slogans will become less effective.

I can’t imagine a politician signing up for that.

So while the problem is much deeper and systemic, I think some changes to social media can make it act as a firebreak, so the situation does not descend into violence, which we saw with the Capitol insurrection. If social media can do the job of not making the problem worse, and simply keep its effects neutral, that should be a win.

To recap, the changes that social media companies can make:

Adjust their algorithms to avoid recommending extremist content.

Make simple design changes that allow people to stop and think before sharing content.

Tobi Olabode 07/02/2021

Why Different Social Media Companies Promote Different People

Have you noticed that social media services are used by different people?

Rich VCs talk about life and wealth on Twitter.

Fashionistas share their work on Instagram.

Your uncles and grandmas are on Facebook.

And your favourite video game streamer is on YouTube.

Why is that?

These social media services have millions of users. So they should house every community you can think of. So, it can’t be because of the culture.

The answer has to do with the medium itself. Clothes are inherently visual. You can write an essay about them. But a picture of them gives you all the information you need to know.

Instagram is one of the best places to share photos online. And one of the top places to share your fashion creations.

Professionals on Twitter

Why are there lots of writers on Twitter?

Because Twitter is designed mostly for writing. Even when you share memes, you still need to write something. The 280-character limit makes writers compress their thoughts into fewer words.

If you want to explain more, you can create a thread about the idea, which acts as a mini blog post.

Twitter is a great place to share things you read with numerous people. And you can give your two cents on the situation while sharing it.

People tend to share their longer writing on Twitter, like blog posts on their website or their newsletter, and generate excitement over Twitter. This may help explain why Twitter bought Revue, a Substack competitor, so it can integrate it into the app. For many newsletter writers, Twitter is their main acquisition tool.

This may also explain why the network has a lot more professionals. A lot of American coastal users use the service.

Many people mention that Twitter is their best networking tool, as they share ideas that people in their industry find valuable. And people use direct messages to start personal connections.

Twitter holds massive appeal for white-collar professionals. And mainly city dwellers use the service.

This is why there is an oversampling of professions such as journalists, VCs, programmers, marketers, writers, tech founders etc.

No blacksmiths, plumbers, or linesmen.

The journalism part has to do with Twitter’s design. Twitter is known as the place to get breaking news. So, Twitter is the place to go when looking for material to write about.

Contrast that with Facebook.

Friends and family with Facebook

Facebook is simple. Facebook is made for friends and family.

So it makes sense that your extended family is on there. But your friends may not be on there. Depending on your age, they probably left a while ago and are using other services like Snapchat and Instagram.

Facebook has bought and made more apps for you to talk to friends and family, like WhatsApp and Facebook Messenger. No one is creating long-form essays on these services, as they are designed for communication between people, not for explaining one’s thoughts about the world.

You can argue WhatsApp is a bit different, because large groups can work as channels. Like a company communicating with its users. Or a news company sharing what’s happening with its local community.

This may explain why Facebook is popular in the American Midwest and other less urban areas, where there is more of a focus on areas of life other than a career (hence the friends and family focus), compared to the coastal parts of America. But these patterns show up in many countries.

Facebook has a wide appeal to many people. Mainly older folks.

YouTube, The Entertainment Medium

YouTube is the hub for entertainment. So it is a medium not used for communication, but to share ideas and make people laugh. There is a lack of direct messaging on YouTube. And communication is designed to be one to many. Think of YouTube comments.

Like Instagram, YouTube tends to be more visual. Because of the longer time limit, people can experiment more. Take the video essay format: a traditional essay with highly engaging visuals to get you hooked. But highbrow stuff like that is not as popular.

Let’s plays, crazy challenges, celebrity vlogs etc. tend to be highly visual and engaging, as lots of stuff happens in those videos. Great videos to watch when you are bored on a train ride.

Because of the amount of time allotted, a lot of creators have time in the video to show their sponsorships, which are basically ads. They also share their merchandise, which you can buy.

Due to the size of YouTube, almost everyone uses it at some point. From a person who wants to learn about fashion to a person who wants to learn about the periodic table. It’s all on YouTube.

The video just has to be entertaining enough before you click away.

Even education videos tend to be highly engaging. The ones that are not, like lectures, get few views or are split up into smaller videos.

But there is a growing genre of video that is not visual per se: podcasts.

You simply watch the host and guest talking on video. I tend to watch these a lot. I guess that is more visual than an audio-only podcast, as you get to see the faces of the guest and host and all their expressions.

A podcast can help a YouTube creator produce many more videos. A one-hour YouTube podcast can be cut into 5 different clips, all linking back to the original podcast. The cost of production is low compared to other genres, like travel or shopping hauls.

It can also be less of a time sink on the creator’s side, compared to something like vlogging.

Design affects the medium

The design of the social network affects who uses the platforms. As it incentivises users to use the app in a certain way.

Instagram is for getting likes and followers. So eye-catching content is pushed.

Twitter is for getting retweets. So funny or outrageous content is pushed.

YouTube is for getting views. So outlandish content is pushed.

Because of that:

People with very visual hobbies will get more traction on Instagram.

People who have controversial opinions do well on Twitter.

People who are entertaining do well on YouTube.

Each platform self-selects for people with the personalities and interests that suit it.

All that explains why people thrive on different platforms. And why you view your favourite creators on different mediums.

If you liked this article, sign up to my mailing list, where I write more stuff like this.

Tobi Olabode 07/02/2021

How to extract currency-related info from text

I was scrolling through Reddit and a user asked how to extract currency-related text in news headlines.

This is the question:

Hi, I'm new to this group. I'm trying to extract currency related entities in news headlines. I also want to extend it to a web app to highlight the captured entities. For example the sentence "Company XYZ gained $100 million in revenue in Q2". I want to highlight [$ 100 million] in the headline. Which library can be used to achieve such outcomes? Also note since this is news headlines $ maybe replaced with USD, in that case I would like to highlight [USD 100 million].

While I have not done this before, I have experience scraping text from websites. And the problem looked simple enough that it would likely only require basic NLP.

So, I did a few Google searches and found many popular libraries that do just that.

Using spaCy to extract monetary information from text

In this blog post, I’m going to show you how to extract currency info from text.

I’m going to take this headline I found on Google:

23andMe Goes Public as $3.5 Billion Company With Branson Aid

Now, by using a few lines of the NLP library spaCy, we extract the currency-related text.

The code was adapted from this Stack Overflow answer.

import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp('23andMe Goes Public as $3.5 Billion Company With Branson Aid')

extracted_text = [ent.text for ent in doc.ents if ent.label_ == 'MONEY']

print(extracted_text)

['$3.5 Billion']

With only a few lines of code, we were able to extract the financial information.

You will need extra code when dealing with multiple headlines, like storing them in a list and having a for loop do the extraction of the text.
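A rough sketch of that loop could look like the function below. The name extract_money is made up for this example. It takes the loaded spaCy pipeline as an argument and only relies on the same doc.ents, ent.text and ent.label_ attributes used in the snippet above.

```python
def extract_money(headlines, nlp):
    """Return a dict mapping each headline to the MONEY entities found in it."""
    results = {}
    for headline in headlines:
        doc = nlp(headline)
        # Same filter as the single-headline version above
        results[headline] = [ent.text for ent in doc.ents if ent.label_ == 'MONEY']
    return results


# Usage, assuming spaCy and the en_core_web_sm model are installed:
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   headlines = ['23andMe Goes Public as $3.5 Billion Company With Branson Aid']
#   print(extract_money(headlines, nlp))
```

Headlines with no money mentioned just get an empty list, so you can tell which ones had nothing to highlight.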

spaCy is a great library for getting things done with NLP. I don’t consider myself an expert in NLP. But you should check it out.

The code is taking advantage of spaCy’s named entities.

From the docs:

A named entity is a “real-world object” that’s assigned a name – for example, a person, a country, a product or a book title. spaCy can recognize various types of named entities in a document, by asking the model for a prediction.

The named entities have annotations, which we’re accessing with the code. By filtering the entities to the money type only, we make sure that we are extracting the financial information from the headline.

How to replace the currency symbol with the currency abbreviation

As we can see, spaCy did a great job extracting the wanted information. So we did the main task.

In the question, the person also needed help with replacing the dollar sign with USD and with highlighting the financial information.

The replacement of the dollar sign is easy, as this can be done with a built-in Python string method.

extracted_text[0].replace('$', 'USD ')

USD 3.5 Billion

Now we have replaced the symbol with the dollar abbreviation. This can be done with other currencies that you want.
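For a few currencies at once, one option is a small lookup table. This is only a sketch: the CURRENCY_CODES dict and the symbol_to_code function are names I made up, the three entries are just examples, and it assumes that $ always means US dollars, which will not hold for every news source.

```python
# Illustrative symbol-to-code pairs; extend the dict for the currencies you need
CURRENCY_CODES = {'$': 'USD', '£': 'GBP', '€': 'EUR'}


def symbol_to_code(text):
    """Replace each known currency symbol with its code followed by a space."""
    for symbol, code in CURRENCY_CODES.items():
        text = text.replace(symbol, code + ' ')
    return text


print(symbol_to_code('$3.5 Billion'))   # USD 3.5 Billion
print(symbol_to_code('£200 million'))   # GBP 200 million
```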

Highlighting selected text in data

The highlighting of the text moves away from processing data and more into the realm of web development.

It would require adjusting the person’s web app to have some extra HTML and CSS attributes.

While I don’t have the know-how to do that, I can point you in some directions:

Highlight Searched text on a page with just Javascript

https://stackoverflow.com/questions/8644428/how-to-highlight-text-using-javascript

https://markjs.io/
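If the web app builds its pages from Python, one simple starting point is to wrap the extracted text in an HTML mark tag before sending the headline to the page. Treat this as a sketch under that assumption, not a finished solution. The highlight function is a made-up name, and it escapes the text first so the headline itself cannot inject HTML.

```python
from html import escape


def highlight(headline, phrases):
    """Return the headline as HTML with each phrase wrapped in a <mark> tag."""
    html_text = escape(headline)
    for phrase in phrases:
        safe = escape(phrase)
        html_text = html_text.replace(safe, f'<mark>{safe}</mark>')
    return html_text


headline = '23andMe Goes Public as $3.5 Billion Company With Branson Aid'
print(highlight(headline, ['$3.5 Billion']))
```

Browsers show the contents of a mark tag with a highlighted background by default, and the colour can be changed with CSS.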

Hopefully, this blog post has helped with your situation and put you on your way to completing your project.

If you want more stuff like this, then check out my mailing list, where I solve many of your problems straight from your inbox.