Another Quora Math Answer about Split-Complex Numbers

Well, I’ve been on a roll with Quora math answers recently! One person asked the question of whether you can have a negative absolute value. In other words, could you have a negative distance between two points? Rather than dwelling on all the rules this would break (who needs rules!?), I decided to construct such a system.

And I found that it looked like something I’ve seen before: the split complex numbers. The one application I’ve seen for these numbers is in an online dating app, as presented at a RecSys workshop in 2012!

In this number system, you have a new special number called “j” which lives outside our usual number system. This number has the special property that j * j = 1.

The dating application works like this:
You have people of the same gender who are similar (positive numbers)
People of the same gender who are different (negative numbers)
People of the opposite sex who are good matches (positive j numbers)
People of the opposite sex who are bad matches (negative j numbers)

These assumptions correspond to mathematical statements (capital letters for people of the same gender, and lowercase for people of the opposite).
If you’re similar to A and A is similar to B, you’re similar to B (1 * 1 = 1)
If you’re different from A and A is similar to B, then you’re different from B (-1 * 1 = -1)
If you match a, and a also matches B, then you’re similar to B (j*j = 1)
If you don’t match a, and a matches B, then you’re different from B (-j * j = -1)
And so on!
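To make the arithmetic concrete, here is a minimal Python sketch of split-complex multiplication. The class name `SplitComplex` and its field names are my own choices for illustration and do not come from the RecSys paper. It only encodes the rule j * j = 1 and checks the statements above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitComplex:
    """A number a + b*j, where j * j = 1 (hypothetical helper for illustration)."""
    real: int
    j: int

    def __mul__(self, other):
        # (a + bj)(c + dj) = (ac + bd) + (ad + bc)j, because j * j = 1
        return SplitComplex(
            self.real * other.real + self.j * other.j,
            self.real * other.j + self.j * other.real,
        )

    def modulus_squared(self):
        # The split-complex analogue of |z|^2 is a^2 - b^2, which can go negative
        return self.real ** 2 - self.j ** 2


ONE = SplitComplex(1, 0)
J = SplitComplex(0, 1)
NEG_J = SplitComplex(0, -1)

print(J * J)                 # match * match  -> similar (1)
print(NEG_J * J)             # bad match * match -> different (-1)
print(J.modulus_squared())   # -1: the "negative distance" from the Quora question
```

The last line is the link back to the original question: the squared modulus of j is negative, which can't happen with ordinary complex numbers.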

So here’s someone talking about negative space and mathematical impossibilities, and we end up with an online dating application! Yeah, I realize this is a heteronormative number system* that also reduces human personality to a single dimension – but still it’s pretty cool!

*That must be why it wasn’t part of the Yale curriculum.

Here’s my full answer on Quora
Here’s the RecSys workshop paper I referenced

Post about Polygon Approximations to Pi

So I spent a few minutes tonight writing an interesting math answer on Quora: Is a circle a shape with an infinite number of corners?

The challenge in answering this question is that even if the premise of the original question is incorrect, I feel like I can try to figure out the intuition that led to the question and how that relates to more complex mathematical topics.

I’m not sure about the math background of the question writer, or the people who are reading my answer, which makes these kinds of things hard to do. Maybe it’ll be worth my time to do more of these – I often have a hard time finding blog post ideas but Quora has a never-ending supply of content ideas!

Movie Search in Foursquare

Today I want to share a new feature that’s available on the Foursquare app and give a little background on how we got the ball rolling.

First – and I’m curious to see if any of you have different answers – where do you check movie times online? Maybe you use Google or an alternative search engine like Bing. Maybe you use a service that specializes in movie times like Fandango or Movie Tickets.

What if you’re on the go? Normally when I want to check the movie times I only have a smartphone. Just opening Google works alright, though it’s a little clunky. It will show a map of each individual theater playing the movie, but you can’t get a map of all theaters and compare!

Fandango can’t seem to build a reasonable app for this – it’s full of popup ads which are horrible when you’re on a tight deadline. Again the maps are limited, and each page is full of flashy ads that really take away from the user experience. Now I’m no designer and I know this is subjective, but check out this train wreck:

 
IMG_3943.jpg
 

So I thought we could do better at Foursquare. I’m already using Foursquare to plan my time when I’m outside the apartment, and finding movie theaters is a part of that. Why not be able to check the movie listings when I’m already figuring out which restaurant or bar to go to?

Foursquare already has:
– A great interface for searching.
– A way to see all results on a map.
– No ads that get in the way of completing the goal
– Uber integration. Is anyone prone to choosing movies that start in 5 minutes? “Come on we’ll get there before the previews are done!”

We’re just missing the ability to buy the ticket – but who knows maybe we’ll get that one day!

My role in this is small, but I want to give a little insight on how a demo can help. We’ve been importing movie times from an external data source for a while, which allowed people to check in to movies on Swarm. We even had the movies listed in our search indices! All that needed to be done was build a page where you can search for the data.

When hack day rolled around (that’s a day where engineers at Foursquare pick up on these kinds of projects that are outside their main area of focus), Stephanie Yang (@stpyangblog) and I decided that we were going to build this demo. It took us a little longer than the day we were supposed to use, but it came out looking like this:

Screen Shot 2015-09-22 at 11.31.01 AM

Screen Shot 2015-09-22 at 11.31.31 AM

There were 3 key elements to our minimal product:

  1. Show me a list of movies and movie times in my area. We had pictures and descriptions lying around, as well as an idea of how popular each movie was (thanks Swarm checkins!) in order to do the rankings.

  2. A search box with autocomplete. This is so important! As you type, a list of potential movie matches comes up. This gets you to your search faster and prevents spelling errors and similar problems that can come up.

  3. A search results page that will show you the information for the movie you want.
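As a rough illustration of element 2, here is a minimal Python sketch of prefix autocomplete ranked by popularity. This is not the code we wrote at Foursquare. The function name `autocomplete`, the movie titles, and the check-in counts are all made up for the example, and a real system would use a proper search index and not a linear scan.

```python
def autocomplete(prefix, movies, limit=5):
    """Return up to `limit` titles that start with `prefix`, most popular first.

    `movies` maps a title to a popularity score (e.g. a check-in count).
    """
    p = prefix.strip().lower()
    if not p:
        return []
    matches = [
        (title, popularity)
        for title, popularity in movies.items()
        if title.lower().startswith(p)
    ]
    # Most popular first; break ties alphabetically so the order is stable
    matches.sort(key=lambda m: (-m[1], m[0]))
    return [title for title, _ in matches[:limit]]


# Made-up titles and popularity scores, for illustration only
MOVIES = {
    "Star Harbor": 950,
    "Starlight Diner": 400,
    "Stone Bridge": 620,
    "Moon Garden": 310,
}

print(autocomplete("st", MOVIES))    # ['Star Harbor', 'Stone Bridge', 'Starlight Diner']
print(autocomplete("star", MOVIES))  # ['Star Harbor', 'Starlight Diner']
```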

The pages that we designed were usable, but Stephanie and I spend most of our time on data science and backend engineering, not front-end engineering and design. And we only had a day or two – so I’m really happy with the output.

Now usually these demos just kind of sit there and nothing comes of them, but the team working on search quality liked the demo and saw people actually searching for movies in the app (I bet they were really annoyed when it didn’t work!). So, they decided to put it in the app.

And the result is great! In my opinion, Foursquare is now the best app to use when you’re searching for movies. Here are some screen shots – note that when I start typing in a movie it immediately comes up on autocomplete. The search results are laid out on a list page that we’re all used to seeing. You can click through to see all the venue details for each theater, which Foursquare is already good at. And finally, if you tap the map icon, there’s your map of all the places you can go!

IMG_3944

IMG_3945

IMG_3946

In terms of total work, my role in this was pretty small, since the sequence of events looks like this:

  1. Years of engineering work and infrastructure by other engineers

  2. Our 2-day hack project

  3. Weeks of analysis, coding, and testing by other engineers

But I'm just glad we were able to get that feature up there! What are the takeaways from this experience? I'm not sure, but off the top of my head, here are some things that helped:

  1. We were working on a feature that we personally wanted to use.

  2. Most of the background work had been done (we just had to hook up the last 5%)

  3. The feature fits in well with our current product. It doesn't distract from any of the other use cases and only comes up in autocomplete when we're reasonably sure you're after a movie.

If anyone out there gives it a try - let me know how it goes!

News Corporation Sells Amplify

I worked at Wireless Generation early in my career. It was the education company that was bought by News Corp in 2010 and became Amplify.

You can find some articles on it here and here, but the short story is that after 5 years at News Corp, the company wasn’t performing as well as they had hoped, and they sold it to private investors.   There were also massive layoffs.

Wireless Generation/Amplify is a data-driven education company. A large part of the focus when I was there was helping teachers in the early elementary school years ensure that all of their students learn to read and learn basic math skills. It goes without saying that getting this right for kids is really important, and in the 2000s we were starting to see internet-scale data on this for the first time. Sometimes at Foursquare, we’ll improve some click-through-rate by 1%. But if you improve reading-rate by 1%, think about how many lives are changed for the better!

Wireless Generation also built some of the first open-source curriculum for the internet, and it looks like Amplify now has a really fun math game. The positive side of my experience there, along with some talented employees, was around the products we were building.  The downside was that despite rhetoric to the contrary, management style was much too top-down for my liking. Working on a large contract for the NYC DOE was particularly painful since decisions were made by government bureaucrats and were sometimes politically driven. I can probably write several posts on frustrating times at Wireless Generation!

I can only speculate on what went wrong, and even if the articles had gone into more detail I would be certain that the story from the inside is completely different. What is it that News Corp miscalculated? As far as I can tell, there weren’t any major setbacks.

It turns out that while the original sale was going on, I was taking a course on business strategy at NYU. I had emailed my thoughts to the professor, and I was able to find them. This is an excerpt from December 2nd, 2012, printed as is (along with some awkward phrasing!)

A lot of people are asking why news corp want to get into education.  Clearly, the newspaper business is not a great one to be in right now – maybe they feel like they need to do something.  I feel that they might might want the company because they want expertise in digital content distribution (I developed hand-held and web applications while I was there).  However, they’d probably be better off just hiring a much smaller team to do that.  Someone said this is just about Rupert Murdoch trying to build a legacy.  I don’t know.

Other people are asking whether it will work.  They’re keeping the same management team.  I have a feeling that WirelessGeneration‘s growth is now going to be heavily subsidized by news corp.  But after Monday’s class discussion I’m wondering what news’ other businesses will get out of it.

Of course, I still didn’t mind getting a check for shares.

Because their products are so important, I hope that Amplify can refocus in the future. I can see a few things going for it:
– It will be under the leadership of the original founder, Larry Berger. He is a capable leader, knows what he’s doing in education, and I’m sure he’ll have big plans.
– It will have a smaller, more focused team. If it has 400, it’ll be the same amount as when I was there.
– The ownership will be private. There will be no parent corporation in an entirely different industry trying to steer the agenda.
– If they get to keep their amazing office space in DUMBO.. can’t beat that!
This is on the patio – taken by me in 2007.

It sounds like they are going to try to refocus on the original mission, this time with a much more experienced team. I’m feeling a lot more optimistic for them than I was 5 years ago – this could be the low in a turning point of sorts!

Also – I know this could be a difficult time for people who are still working there, so I wish you all the best. If you are a current employee or recently laid off and you need a new job, I’d be happy to meet up and show you a demo of what we’re working on at Foursquare. I have a lot of respect for anyone who is working through all the issues at Amplify, particularly the engineering and product teams.


Appearance on BK Live for Beyond Coding

Over the past few months, I've been involved with a program called beyondcoding.io, which is a series of courses at New York City tech companies designed to develop career skills for people who are entering the tech industry.

I produced one of the classes with Maryam (see our NLP talk) at Foursquare on technical communication. I would describe it as a workshop that involved talks by Maryam and myself as well as some audience participation and group communication.

On Tuesday, I was interviewed on BK Live, which is a show on BRIC TV (a Brooklyn-focused television station in New York). Also appearing on the round table was Bethany Marzewki, who works at Stack Overflow and did a great job organizing Beyond Coding with the New York Tech Talent Pipeline. This also features program students Keanna Hines and Shlomo Maghen!

Ok enough of the background, here's the full video!

A few notes on what I said:

  1. I really wish I had played up Foursquare and Swarm more, especially for high school and college aged people. I mean - if that were available to me when I was in high school or even much younger, I'd be all over it! For one, I'd compare checkins to the (now defunct) North and South cafeteria in Weston High School. I was sort of vague on whether Foursquare or Swarm would be something that could get a teen interested in technology. Obviously there are certain applications, like do-my-taxes applications, that aren't going to get anyone interested. I should have put Foursquare and Swarm squarely on the other side of that!

  2. I don't condone cheating.. seriously. Good thing I'm not running for office, or that'd be a great campaign ad for my opposition. I think modding games, though, and learning how to hack code in the process, is great. So long as you're not betting on the games.

Brooklyn Neighborhood: Sunset Park

Before I get into some of the more technical or idea-oriented posts, I want to practice by talking about one of my day trips last week, and that was to Sunset Park, Brooklyn.

Now last week I actually took a week off from work, and it seems like every time you take time off you’re expected to go someplace exotic or at least visit someone in another city. Instead, I opted to stay at home.  This is apparently known as a “staycation”!  As it turns out, there’s so much to enjoy about living in Brooklyn.

I can write a whole post on just how this “staycation” went, but instead I’m going to start a series on Brooklyn neighborhoods. On Wednesday, I decided to go to the closest neighborhood that I haven’t checked out yet, which was Sunset Park.

I went in the early afternoon and I had a few hours to look around. I rode a few stops on the Bay Ridge-bound R train from DeKalb to 45th Street. When I got out, there were rows of shops – including some Mexican grocery stores – as well as some interesting architecture on a nearby Catholic church, which really stands out on 4th Ave. The cross streets are lined with nice brownstones.

IMG_3257

IMG_3258

So then of course I had to get to the park itself. Starting at the base of the park, it doesn’t look like much. Once I walked up the steps, I could see there was tons of activity for a Wednesday evening. The park was full of picnics and sports. It was difficult to find the much talked-about view of Manhattan at first, but once I moved into the right position it was pretty amazing. You get Downtown Brooklyn, Downtown Manhattan, Jersey City, and the Statue of Liberty all in one shot!

IMG_3261

IMG_3271

IMG_3276

In that last photo, I can actually see the apartment building where I live, which is the one reflecting sunlight. The view doesn’t go both ways – I still can’t see the park very well from the roof of the building! In the photo, you can also see nearby One Hanson Place which is the big clocktower all the way to the right.

This is a pretty unique view of Manhattan! I’ve seen the skyline from many different angles, but this one actually looks like you’re looking “down” at the city.

So finally I needed to stay for the Sunset. It turns out that they don’t call it Sunset Park for nothing. I’ll let the pictures speak for themselves.

IMG_3264

IMG_3273

IMG_3282

After that I needed to get back, but I wanted to grab dinner first. Foursquare tried to get me to go to Tacos Matamoros but that was mainly a sit-down place, so I ended up getting takeout at Tacos El Bronco. It was very good, and obviously the neighborhood has no shortage of Mexican food!

Overall, I think this is a great place to spend the day or a few hours if you want to explore a lesser-known (especially to tourists) neighborhood of New York City. It’s also a fairly easy ride on the subway. Perhaps I’ll follow up with my take on other neighborhoods in the future!

Kicking off a Blog

Hi everyone!

I’ve been meaning to set up a blog for a while, and I finally sat down and took a few minutes to put it up. Much more difficult than the technical side of it is the content or “product” side. What will my new blog’s name be? What will it be about? How often will I post?

I decided not to let these questions get in the way of actually getting started. But I do in fact have some ideas based on my interests and level of knowledge. So, I listed some topics on my side bar to give me some ideas:

  1. Data Science: I’ve been doing this for about 4 years now at Foursquare, and I’ve spoken at conferences and meetups. Perhaps I can write some posts that explain how machine learning works to a general audience and how it’s integrated into all the applications that we use every day.

  2. Recommendation Systems: This is also what I do at Foursquare! Also software engineering, product design, NLP, etc.

  3. Technology innovation and projections: I like to take a look at the landscape of products being released and proposals being made by entrepreneurs and try to understand where opportunities for change could be, as well as try to predict how our daily lives will be different in the future.

  4. New York City: I live here! And if you read my twitter feed, you’ll see that I have a lot to say about it. What are all the different neighborhoods to explore? What about all these high rises going up in my Fort Greene neighborhood? I even took a few shots at the Mayor’s Uber proposal.

Now, I’m a little concerned that people interested in this topic may be less interested in the tech stuff and vice versa, which could confuse my blog audience. But I won’t have to worry about that unless I actually GET an audience.

So that’s kicking it off. I imported my old Tumblr posts earlier today. I still need to work on design. Any more ideas and  tips are welcome!!

Now hopefully if you come back in a year this post will be at least a few pages down.

Introducing Foursquare for the iPad, the best way to plan your holiday travels.

Natural Language Processing at Foursquare

Last month, Maryam Aly and I gave a workshop for NYU tech week where we spoke about how Natural Language Processing is integrated into the Foursquare app and our technology stack. Later, we gave the students a hands-on introduction to the nltk toolkit.

Hakka Labs took video of the first part of the workshop and posted it. Here it is:

https://www.hakkalabs.co/articles/introduction-natural-language-processing

On this 4sqDay, we’re celebrating our amazing global community of superusers. Happy 4sqDay!

Digging into the Dirichlet Distribution

This is a link to my talk on the Dirichlet Distribution at the machine learning meetup:

http://www.hakkalabs.co/articles/the-dirichlet-distribution/

The open source project I’ve referenced lives here: https://github.com/maxsklar/BayesPy Feel free to jump in if you’re interested!  I have a paper on it that unfortunately did not get accepted to AISTATS (they cited lack of impact; I disagree).  I’ll try to fix it up and get it on arXiv in the next few weeks.

Terrible statistical writing: NatGeo Global Warming Article

I find it really hard to inform myself on certain issues when the mathematical arguments presented in articles just make no sense.

I was reading this recent article by National Geographic called “does the ‘global warming pause’ debate miss big picture”.

It starts out by stating that there’s been a decrease in the rate of global warming in the 21st century, but cautions readers that this doesn’t mean global warming has stopped.  So far, so good - I understand what argument is going to be presented.

Then it links to a research paper that says the slowdown in global warming is caused by the El Niño cycle in the Pacific.  I can’t speak to this analysis since I don’t know enough climate science, and I don’t have the actual data to analyze.  But at this point, I’m still happy with the article, I think their point is very clear.

Okay - now a little bit further down we get a quote from Gavin Schmidt at NASA.  He doesn’t read too much into the pause, and says “If you take 1998 out, there is no pause.”

Wait a minute!! I hope I’m not alone here - I think anyone who analyses data should start to get worried here.

First of all, if that were true, any analyst who says they saw a global warming pause is not doing their job.  If you see a pattern, and taking away one data point removes the pattern, your model is not robust enough to be trusted.

Secondly, this totally contradicts what’s been said earlier in the article.  Just in the last paragraph, we are told “there’s no denying that temperature has plateaued in the last decade” (which doesn’t include 1998).  And in the paper about El Niño that they posted, the abstract starts by stating, “the annual-mean global temperature has not risen in the twenty-first century”.

So either those two sources are wrong, or Schmidt’s quote is wrong.  But the article continues along as if nothing is amiss.

Then, in order to support Schmidt’s claim, the article offers: “the ten hottest years since 1880 have all happened since 1998, with 2010 being the hottest of all.” But the argument is that the rate of global warming in the 21st century has dramatically slowed, not that the planet is cooling.  So of course we’d expect recent years to be warmer. That doesn’t make Schmidt’s case at all - they’re confusing the first derivative with the value.  Even if the derivative went to zero (an “exact” pause in the global temperature), it sounds like the data would still be consistent with that.
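Here is a small Python sketch of the derivative-versus-value point. The numbers are entirely made up (a straight-line rise followed by a perfectly flat plateau) and are not real temperature data. The only purpose is to show that "the hottest years are all recent" can be true even when the recent rate of change is exactly zero.

```python
# Hypothetical series: rises 0.02 per year until 2001, then stays perfectly flat.
# These are NOT real temperature measurements.
years = list(range(1980, 2014))
temps = {year: 0.02 * (min(year, 2001) - 1980) for year in years}

# The ten hottest years (ties broken by earliest year)
hottest_ten = sorted(years, key=lambda year: (-temps[year], year))[:10]

# Change over the plateau: this is the "first derivative" part of the argument
plateau_change = temps[2013] - temps[2001]

print(min(hottest_ten))   # 2001: every one of the ten hottest years is recent
print(plateau_change)     # 0.0: and yet there has been no warming since 2001
```

So a list of record-hot recent years tells you about the value, and by itself says nothing either way about whether the rate has slowed.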

Anyway, maybe I should do something else on a Sunday night.  Anyone see Breaking Bad? I still have to catch up.

DataGotham 2013 Talk, or Japanese vs Russian reviews

I gave a talk recently at DataGotham (http://www.youtube.com/watch?v=1KfK0zOSo5U), and I’ve gotten a lot of questions about one particular stat that I gave in that talk.  If you write a tip in Russian, then you’re 3 times as likely to hate the place as if you write it in Japanese.  Where does that come from?

Well, I don’t mean that the same person is necessarily going to come to the same conclusion just by switching language - although that would be a neat experiment to run.  What we did was first categorize our Foursquare tips by language, and for each tip we looked at people who liked and disliked the venue.  This was done by language and not country, because for sentiment analysis we build a different model for each language (a negative English tip is a negative English tip anywhere).

It should also be pointed out that we’ve ignored tips that are written without an explicit review (even though we still do sentiment analysis on those).  This ratio is simply negatives / (negatives + positives).
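For clarity, the ratio can be written as a tiny Python function. The name `negative_ratio` and the counts in the example call are hypothetical. The actual per-language counts aren't given in this post.

```python
def negative_ratio(negatives, positives):
    """Share of explicit reviews that are negative: negatives / (negatives + positives)."""
    total = negatives + positives
    if total == 0:
        raise ValueError("no explicit reviews to compute a ratio from")
    return negatives / total


# Hypothetical counts, just to show the calculation
print(negative_ratio(5, 95))  # 0.05
```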

So it turns out that there’s a correlation between language and the type of review received by the venue.  The reason for this is purely speculative, but some have suggested cultural differences.  I’m open to hearing other hypotheses.

A few things to note: the data is overwhelmingly positive.  Even Russian speakers are over 90% positive.  Japanese and Russian are the two outliers among the languages we considered that really stuck out.  The rest of the languages kind of bunched up in the middle.

Here’s a graph with all the languages I considered (some data had to be cut down for the DataGotham slide).  The 2-letter language codes are from 

image

Here are the actual percentages:

japanese 3.25%
german 4.72%
dutch 4.79%
italian 4.84%
thai 5.44%
indonesian 6.16%
korean 6.37%
english 6.98%
spanish 7.14%
turkish 7.27%
french 7.31%
portuguese 7.55%
arabic 7.84%
russian 9.81%
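As a quick sanity check on the "3 times" figure, this Python snippet recomputes it from the percentages listed above. The dictionary name `NEGATIVE_PCT` is just a label I picked. The values are copied from the list.

```python
# Percentage of explicit reviews that are negative, by tip language (from the list above)
NEGATIVE_PCT = {
    "japanese": 3.25, "german": 4.72, "dutch": 4.79, "italian": 4.84,
    "thai": 5.44, "indonesian": 6.16, "korean": 6.37, "english": 6.98,
    "spanish": 7.14, "turkish": 7.27, "french": 7.31, "portuguese": 7.55,
    "arabic": 7.84, "russian": 9.81,
}

ratio = NEGATIVE_PCT["russian"] / NEGATIVE_PCT["japanese"]
print(round(ratio, 2))  # 3.02, i.e. roughly "3 times as likely"

# The two ends of the range
print(min(NEGATIVE_PCT, key=NEGATIVE_PCT.get))  # japanese
print(max(NEGATIVE_PCT, key=NEGATIVE_PCT.get))  # russian
```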

Foursquare’s recent Tumblr post on the issue

Casino Random Number Generator

Here’s how it works: the outcome of each round of the game is either 0 or 1.  Before the outcome is decided, players place bets on either side.  The total amounts bet on each side are confidential.

After betting is closed, the outcome is calculated as the one with the least amount bet on it.  The losers get nothing, the winners double their money, and the casino takes the rest. For example, if there is $100 bet on 0 and $120 bet on 1, the outcome would be 0, there’d be $200 in payouts, and $20 to the house.
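Here is a minimal Python sketch of one round, mostly to double-check the example numbers. The function name `play_round` is my own. The post doesn't say what happens when both sides have exactly the same amount bet, so raising an error on a tie is purely an assumption.

```python
def play_round(bet_on_0, bet_on_1):
    """Settle one round and return (outcome, total_payouts, house_take).

    The outcome is the side with the least money bet on it. Winners double
    their money, losers get nothing, and the house keeps whatever is left.
    """
    if bet_on_0 == bet_on_1:
        # Assumption: the rules above do not cover ties, so refuse to settle
        raise ValueError("tie: the rules do not say how to settle this round")
    outcome = 0 if bet_on_0 < bet_on_1 else 1
    winning_stake = min(bet_on_0, bet_on_1)
    total_payouts = 2 * winning_stake
    house_take = bet_on_0 + bet_on_1 - total_payouts
    return outcome, total_payouts, house_take


print(play_round(100, 120))  # (0, 200, 20)
```

Note that the house take is always the difference between the two sides, so the casino never loses money on a round under these rules.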

Any conceivable pattern in the outcomes will be obliterated, and expectations by the players will become self-defeating prophecies.  In other words, if we could get people to play this game, the outcomes would be about as random as you can get.

Would this work?  What are the flaws?