
A Mistake In Binary Search / Making Three Sum More Efficient (Quick Note)

Not a lot of updates for today.  This morning I worked through a little bit of the algorithms course, including coding a very sloppy version of binary search.  The Java...I'm still learning the Java.  But the main problem I was having was my binary search getting stuck in an endless loop.  It was because I was updating my high and low values improperly: I was setting my low to half the difference between high and low, rather than to the current low plus half the difference between high and low, so the search window never closed properly.  But once I figured that out I saw a much nicer implementation from the course instructor himself.

The big takeaway from the lesson, though, was that you can use binary search, which is very efficient (O(log N) comparisons in the worst case), to make the three-sum algorithm much more efficient.  Instead of looping over all three values j, k, and l, you sort the array first, and then for every unique pair j and k you do a binary search for an l such that

j + k + l = 0

or, equivalently,

j + k = -l

So that's the big idea that makes three-sum much more efficient: instead of the brute-force triple loop's N^3 work, you sort once (N log N) and then do a log N search for each of the N^2 pairs, for N^2 log N overall.
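To make that concrete, here's a sketch in Java of both pieces: a binary search with the corrected low/high updates, and a three-sum count that uses it.  The class name, variable names, and the sample array in main are mine for illustration, not the instructor's code.

```java
import java.util.Arrays;

public class ThreeSum {
    // Iterative binary search over a sorted array.  The fix for my
    // endless loop: mid is lo + (hi - lo) / 2 -- the current low PLUS
    // half the gap -- and lo/hi must move past mid on each miss.
    static int binarySearch(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if      (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;   // key not present
    }

    // For every pair (a[i], a[j]), binary-search for -(a[i] + a[j]).
    // Sorting costs O(N log N); the pair loop with a log N search
    // inside costs O(N^2 log N) -- versus O(N^3) for the triple loop.
    static int threeSum(int[] a) {
        Arrays.sort(a);
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++) {
                int k = binarySearch(a, -(a[i] + a[j]));
                if (k > j) count++;   // k > j counts each triple once
            }
        return count;
    }

    public static void main(String[] args) {
        int[] a = {30, -40, -20, -10, 40, 0, 10, 5};
        System.out.println(threeSum(a));   // prints 4 for this sample input
    }
}
```

The `k > j` check is what keeps a triple from being counted three times, once per ordering of its elements.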

Other than that, I had lunch with some tech leads via networking (TM).  But it was just a meet and greet (with expensive wine that I didn't have to pay for, thank the lord).  After I'd introduced myself, they remarked in passing to the man who'd introduced me, "Oh, but we need senior developers!"  Just a little joke -- but I think I took it in stride.  And I talked a lot.  Maybe too much.  At least I got lunch out of the deal...and met a developer who is also reading Das Kapital -- which is something you don't find every day, I think.

Turning in.  See you next time.
