Archive for the ‘palantir’ Category

Software Dev Intern Projects – 2011

February 20th, 2012 | Ari Gesher

Palantir Technologies Interns, 2011

As we roll into the peak of internship season, it seems like a worthwhile time to talk about just what it is that interns do in their time at Palantir: our software engineering interns are full members of the development team from the day they arrive. During their time with us, they design, implement, and test their projects while working alongside the full-time engineers.

We believe in hitting the ground running; in fact, before you get your badge on the first day, you must commit code. But don’t worry, this all takes place under the watchful eyes of your mentor: a full-time engineer who’s there to guide you through everything from our development environment, to how to write specifications, to software architecture, to how to figure out which cafe has the best lunch.

You can broadly break down the projects the Palantir interns worked on into a number of categories:

Join Us In 2012!

We’re currently hiring for our 2012 intern class as well as full-time positions. For those interested in working on the sorts of projects mentioned here, you’ll want to apply for this job:

For other internships, check out our open positions page.

For those from non-traditional schools with co-op programs that run during the year, we offer co-ops as well. Go ahead and apply for the appropriate intern position and note in your application when you’d like your co-op to be.

Read on to dive into a large sampling of software engineering projects our interns worked on this past summer.

Introducing Palantir’s first open source releases

December 14th, 2011 | Ari Gesher

Palantir Technologies Open Source

We’re big fans of open source. Libraries from Apache, Google, and various projects hosted on SourceForge.net make up a significant fraction of the third-party code we use to build our products.

We’re proud to be making our first set of open source releases with these two projects: Cinch and Sysmon.

We think it’s the right thing to do, to add our voice to the chorus of developers making software available to freely use, modify, and distribute. These two projects represent our first dip into the open source water – we’re just getting started. As time and other interests allow, we’ll be making other projects available to the dev community.

We’ve chosen the Apache License, Version 2.0 to make our contributions as free from encumbrance as possible – our hope is that many people will find them useful and build on top of them just as we have with our own software.

The Projects

code editor showing Cinch annotations

Cinch – Cinch makes MVC in Swing easy

Cinch is a Java library for simplifying certain types of GUI code. When developing Swing applications it’s easy to fall into the trap of not separating out Models and Controllers. It’s all too easy to just store the state of that boolean in the checkbox itself, or that String in the JTextField. The design goal behind Cinch was to make applying MVC easier than not applying it, by reducing much of the typical Swing friction and boilerplate. Cinch uses Java annotations to reflectively wire up Models, Views, and Controllers.

Already in heavy use inside the Palantir Government product, Cinch changes GUI development in Java to be similar to iOS and OS X’s Cocoa, where annotations are used to bind controls to fields.

Graph of CPU usage over time

Sysmon – A lightweight platform monitoring tool for Java VMs

Sysmon is a lightweight platform monitoring tool. It was designed to gather performance data (CPU, disks, network, etc.) from the host running the Java VM. This data is gathered, packaged, and published via Java Management Extensions (JMX) for access using the JMX APIs and standard tools (such as jconsole). Sysmon can be run as a standalone daemon or as a library to add platform monitoring to any application.

Originally built as a component in our Palantir cluster monitoring server, this project should be helpful in scenarios where you need to get data off a host platform and into a VM.

Let us know how we’re doing

We’d love to hear from you on how we’re doing. Aside from the normal outlets to communicate about the projects themselves (see the mailing lists and issue trackers for each project), please feel free to email me directly, Ari Gesher, as the curator of these projects.

How to Rock a Systems Design Interview

October 28th, 2011 | John Carrino

Compiler design dependency comic, originally from http://www.xkcd.com/754/
Comic courtesy of XKCD, via Creative Commons License

Note: this is the third installment in our series on doing your best in interviews. Previously: “How to Rock an Algorithms Interview” and “The Coding Interview”.

One interview that candidates often struggle with is the systems design interview. Even if you know your algorithms and write clean code, that code needs to run on a computer somewhere — and then things quickly get complicated. A truly unbelievable amount of complexity lies beneath something as simple as visiting Google in your browser. While most of that complexity is abstracted away from the end user, as a system designer you have to face it head on, and the more you can handle, the better.

At Palantir, many of our teams give a systems design interview along with an algorithms interview and a couple of coding interviews. We don’t expect anyone to be an expert at all three disciplines (although some are). We’re looking for generalists with depth — people who are good at most things, and great at some. If systems design isn’t your strength, that’s okay, but you should at least be able to talk and reason competently about a complex system.

Read on to learn about what we’re looking for and how you can prepare.


The Coding Interview

October 3rd, 2011 | Allen Chang

Einstein Coding Interview Joke Image

Note: this is part two of our series on doing your best in interviews. Part one: “How to Rock an Algorithms Interview”.

Here at Palantir, algorithms are important, but code is our lifeblood. We live and die by the quality of the code we ship. It’s no surprise, then, that coding ability is what we stress the most in our interview process. A candidate can get by with mediocre algorithm skills (depending on the role), but no one can skimp on coding.

Suppose you’re confident in your ability to write great software. Your task in a coding interview (of which there will be several) is to show the interviewers that you in fact do have the programming chops — that you’re an experienced coder who knows how to write solid, production-quality code.

This is easier said than done. After all, coding in your favorite IDE from the comfort of $familiar_place is very different from coding on a whiteboard (on a problem you’re totally unfamiliar with) in a pressure-filled 45-minute interview. We realize that the interview environment is not the real world, and we adjust our expectations accordingly. Nonetheless, there are a number of things you can do to put your best foot forward during the interview.

First, though, we’d like to give you a sense for what we look for during a coding interview. Most important is the ability to write clean and correct code — it’s not enough just to be correct. A lot of people will be interacting with your code once you’re on the job, so it should be readable, maintainable, and extensible where appropriate. If your solution is clean and correct, and you produced it in a reasonable amount of time without a lot of help, you’re in good shape. But even if you stumble a bit, there are other ways to demonstrate your ability. As you work, we also watch for debugging ability, problem-solving and analytical skills, creativity, and an understanding of the ecosystem that surrounds production code.

With our evaluation criteria in mind, here are some suggestions we hope will help you perform at your very best.


How to Rock an Algorithms Interview

September 26th, 2011 | Kevin Simler

Traveling salesman problem comic, originally from http://www.xkcd.com/399/
Comic courtesy of XKCD, via Creative Commons License

We do a lot of interviewing at Palantir, and let me tell you: it’s hard. I don’t mean that we ask tough questions (although we do). I mean that the task of evaluating a candidate is hard.

The problem? Given a whiteboard and one hour, determine whether the person across from you is someone you’d like to work with, in the trenches, for the next n years. A candidate’s performance during an interview is only weakly correlated with his or her true potential, but we’re stuck with the problem of turning the chicken scratch on the whiteboard into an ‘aye’ or ‘nay’. Sometimes it feels like a high-stakes game of reading tea leaves. Believe me, we’re doing our best, but we’re often left with the nagging worry that we’re passing up brilliant people who just had a bad day or who didn’t click with a particular problem.

In an effort to improve this situation, we wanted to write up a guide that will help candidates make sense of this process, or at least the part known as an Algorithms Interview. At Palantir we ask questions that test for a lot of different skills — coding, design, systems knowledge, etc. — but one of our staple interviews is to ask you to design an algorithm to solve a particular problem.

It usually starts like this:

Given X, figure out an efficient way to do Y.

First: Make sure you understand the problem. You’re not going to lose points for asking for clarifications or talking through the obvious up front. This will also buy you time if your brain isn’t kicking in right away. Nobody expects you to solve a problem in the first 30 seconds or even the first few minutes.

Once you understand the problem, try to come up with a solution – any solution whatever. As long as it’s valid, it doesn’t matter if your solution is trivial or ugly or extremely inefficient. What matters is that you’ve made progress. This does two things: (1) it forces you to engage with the structure of the problem, priming your brain for improvements you can make later, and (2) it gives you something in the bank, which will in turn give you confidence. If you can achieve a brute force solution to a problem, you’ve cleared a major hurdle to solving it in a more efficient way.
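To make the "something in the bank" idea concrete, here is a minimal sketch in Python. The problem (largest sum of a contiguous run in a non-empty list of integers) is our own illustrative choice, not one taken from an actual interview. The first function is the slow but obviously valid answer; the second is the well-known linear-time improvement (Kadane's algorithm), which can be checked against the first.

```python
def max_subarray_brute(nums):
    """O(n^3) baseline: try every (start, end) pair and re-sum the slice."""
    best = nums[0]  # assumes nums is non-empty
    n = len(nums)
    for start in range(n):
        for end in range(start, n):
            best = max(best, sum(nums[start:end + 1]))
    return best


def max_subarray_fast(nums):
    """O(n) improvement (Kadane's algorithm): track the best sum ending at each index."""
    best = current = nums[0]  # assumes nums is non-empty
    for x in nums[1:]:
        # Either extend the previous run or start a new run at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


sample = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
# The brute-force answer doubles as a correctness check for the faster version.
assert max_subarray_brute(sample) == max_subarray_fast(sample)
```

Once the brute-force version exists, it gives you both a working answer and a way to test whatever faster idea you try next.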

Now comes the hard part. You’ve given an O(n^3) solution and your interviewer asks you to do it faster. You stare at the problem, but nothing’s coming to you. At this point, there are a few different moves you can make, depending on the problem at hand and your own personality. Almost all of these can help on almost any problem:

  1. Start writing on the board. This may sound obvious, but I’ve had dozens of candidates get stuck while staring at a blank wall. Maybe they’re not visual people, but still I think it’s more productive to stare at some examples of the problem than to stare at nothing. If you can think of a picture that might be relevant, draw it. If there’s a medium-sized example you can work through, go for it. (Medium-sized is better than small, because sometimes the solution to a small example won’t generalize.) Or just write down some propositions that you know to be true. Anything is better than nothing.

  2. Talk it through. And don’t worry about sounding stupid. If it makes you feel better, tell your interviewer, “I’m just going to talk out loud. Don’t hold me to any of this.” I know many people prefer to quietly contemplate a problem, but if you’re stuck, talking is one way out of it. Sometimes you’ll say something that clearly communicates to your interviewer that you understand what’s going on. Even though you might not put much stock in it, your interviewer may interrupt you to tell you to pursue that line of thinking. Whatever you do, please DON’T fish for hints. If you need a hint, be honest and ask for one.

  3. Think algorithms. Sometimes it’s useful to mull over the particulars of the problem-at-hand and hope a solution jumps out at you (this would be a bottom-up approach). But you can also think about different algorithms and ask whether each of them applies to the problem in front of you (a top-down approach). Changing your frame of reference in this way can often lead to immediate insight. Here are some algorithmic techniques that can help solve more than half the problems we ask at Palantir:
    • Sorting (plus searching / binary search)
    • Divide-and-conquer
    • Dynamic programming / memoization
    • Greediness
    • Recursion
    • Algorithms associated with a specific data structure (which brings us to our fourth suggestion…)

  4. Think data structures. Did you know that the top 10 data structures account for 99% of all data structure use in the real world? Probably not, because I just made those numbers up — but they’re in the right ballpark. Yes, on occasion we ask a problem whose optimal solution requires a Bloom filter or suffix tree, but even those problems tend to have a near-optimal solution that uses a much more mundane data structure. The data structures that are going to show up most frequently are:
    • Array
    • Stack / Queue
    • Hashset / Hashmap / Hashtable / Dictionary
    • Tree / binary tree
    • Heap
    • Graph

    You should know these data structures inside and out. What are the insertion/deletion/lookup characteristics? (O(log n) for a balanced binary tree, for example.) What are the common caveats? (Hashing is tricky, and usually takes O(k) time, where k is the size of the object being hashed.) What algorithms tend to go along with each data structure? (Dijkstra’s for a graph.) When you understand these data structures, sometimes the solution to a problem will pop into your mind as soon as you even think about using the right one.


  5. Think about related problems you’ve seen before and how they were solved. Chances are, the problem you’ve been presented is one that you’ve seen before, or at least very similar to one. Think about those solutions and how they can be adapted to the specifics of the problem at hand. Don’t get tripped up by the form in which the problem is presented: distil it down to the core task and see if it matches something you’ve solved in the past.

  6. Modify the problem by breaking it up into smaller problems. Try to solve a special case or simplified version of the problem. Looking at the corner cases is a good way to bound the complexity and scope of the problem. Reducing the problem to a subset of the larger problem can give you a base to start from, and you can then work your way up to the full scope at hand.

    Looking at the problem as a composition of smaller problems may also be helpful. For example, “find a number in a sorted array which has been shifted cyclically by an unknown constant k” can be solved by (1) first figuring out “k” and then (2) figuring out how to perform binary search on a shifted array.


  7. Don’t be afraid to backtrack. If you feel like a particular approach isn’t working, it might be time to try a different approach. Of course you shouldn’t give up too easily. But if you’ve spent a few minutes on an approach that isn’t bearing any fruit and doesn’t feel promising, back up and try something else. I’ve seen more candidates who overcommit than undercommit, which means you should (all else equal) be a little more willing to abandon an unpromising approach.
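The shifted-array example from suggestion 6 can be sketched in Python as two small functions, one per sub-problem. This is an illustrative sketch and not a reference answer: the function names are ours, and it assumes the array holds distinct values sorted in ascending order before the shift.

```python
def find_rotation(nums):
    """Sub-problem 1: find k, the index of the smallest element.

    Assumes distinct values. Runs in O(log n) by comparing the
    midpoint against the right end of the current window.
    """
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if nums[mid] > nums[hi]:
            lo = mid + 1  # the smallest element lies to the right of mid
        else:
            hi = mid      # the smallest element is at mid or to its left
    return lo


def search_rotated(nums, target):
    """Sub-problem 2: ordinary binary search over 'logical' indices.

    Logical index i (position in sorted order) maps to physical
    index (i + k) % n. Returns the physical index, or -1 if absent.
    """
    n = len(nums)
    if n == 0:
        return -1
    k = find_rotation(nums)
    lo, hi = 0, n - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        real = (mid + k) % n
        if nums[real] == target:
            return real
        if nums[real] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


shifted = [4, 5, 6, 7, 0, 1, 2]   # [0, 1, 2, 4, 5, 6, 7] shifted by k = 4
print(find_rotation(shifted))      # 4
print(search_rotated(shifted, 0))  # 4
print(search_rotated(shifted, 3))  # -1
```

Solving for k first keeps the second step a textbook binary search. There is also a single-pass variant, but the two-step version is usually easier to get right on a whiteboard.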

Incidentally, trying out a few different approaches (rather than sticking with a single approach) tends to work well in interviews, because the problems we choose for an interview usually have many different solutions. Happily, the same is true for the problems we solve on the job =)
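As a small illustration of one problem admitting more than one solution, here is a Python sketch of "return the k largest values in a list" solved two ways. The problem is our own example, not one from an actual interview. The first approach leans on sorting, the second on a heap, and they agree on the output while differing in cost.

```python
import heapq


def top_k_sorted(nums, k):
    """Approach 1, O(n log n): sort everything, keep the first k."""
    if k <= 0:
        return []
    return sorted(nums, reverse=True)[:k]


def top_k_heap(nums, k):
    """Approach 2, O(n log k): keep a min-heap of the k largest seen so far."""
    if k <= 0:
        return []
    heap = []
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k, so swap it in.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)


values = [5, 1, 9, 3, 7, 9, 2]
print(top_k_sorted(values, 3))  # [9, 9, 7]
print(top_k_heap(values, 3))    # [9, 9, 7]
```

The sorting version is the quicker one to write and explain. The heap version pays off when k is much smaller than the input, or when the values arrive as a stream.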

Help! Is there a doctor in the network???

July 23rd, 2010 | Ari Gesher

Cyber security is a hot topic, especially in national security circles. The world has witnessed a number of high-profile incidents in the past two years that have been notable for sharing three very important aspects:

  • they were targeted attacks, carried out against specific institutions
  • they were politically motivated and, though the evidence is inconclusive, appear to be state-sponsored
  • they used multi-step, multi-vector attacks and managed to evade existing security countermeasures

This deviates from the types of attacks that IT-centric approaches have sought to defend networks against. Traditional approaches neutralize the perceived threats against a network with a host of countermeasures: firewalls, malware scanners, automated network vulnerability scanning, patch policies, and intrusion detection systems. The network defenses can learn new tricks when the administrators update the signatures, or, for certain types of data, employ a Bayesian inference strategy (as has been employed to fight spam). This approach does a good job of protecting against untargeted attacks as well as weak targeted attacks.

Full network defense requires human analysts looking at anomalies at a level above the automated countermeasures. Check out the rest of this post to take a look at how human-driven, computer-aided analysis is a game changer in cyber security.


A rigorous friction model for human-computer symbiosis

June 2nd, 2010 | Asher Sinensky

This is a response to Ari’s awesome post on human-computer symbiosis. Ari and I were chatting about the equation he developed and I was wondering if there were some further refinements that are possible… let’s take a look:

We are attempting to understand the total analytic capability for a given task a of a human-computer team. Analytic capability in this case probably means:

eq1(1)

Where A is the answer to the analytic problem in question and t_A is the time needed to arrive at the answer based on the inputs available. In the case of chess, A could be the optimum next move given all previous information and t_A would be how long it takes to decide on this move.

Read on for a look at how this generalizes in human-computer symbiotic systems.

Haiti: effective recovery through analysis

April 5th, 2010 | Ari Gesher

[Editor's Note: an edited version of this post first appeared on O'Reilly's Radar blog.]

The prologue was an earthquake of unexpected magnitude and location that left 250,000 dead.

As computer scientists and technologists, we’re used to dealing with large numbers in the abstract. Expressed in human terms, the mind-boggling numbers of 250,000 dead, 300,000 injured and over 1 million people left homeless are hard to comprehend.

Hit the link to read more about how effective data management and analysis is crucial to recovery efforts and see specific examples of data about the situation in Haiti modeled in Palantir Government.

Friction in Human-Computer Symbiosis: Kasparov on Chess

March 8th, 2010 | Ari Gesher

As we build our platforms and applications following a human-computer symbiosis approach, we keep an ear to the ground for interesting examples that illuminate new techniques or validate our approach in some empirical way.

One of the areas that we’re interested in is the overall friction of analysis systems. The systems that we build are built on commodity hardware: we’re not building faster computers, and yet we can deliver orders-of-magnitude better performance on analysis tasks than existing solutions. How do we do this? By building software in such a way that it reduces the friction experienced at the boundaries between the computing power, the analyst, and the source data.

Chess as analysis laboratory

Chess is, at its heart, a predictive venture. The player attempts to anticipate their opponent’s moves, planning their own moves accordingly, with the straightforward goal of finding a sequence of piece moves that forces checkmate.

This game is, in its ideal form, analysis. (The moves made are the logical extension of the analysis.) The data are clean, the problem is well-defined and everyone plays by the same rules. There are even well-defined metrics for ranking chess players by skill — a better chess player is a better chess-game analyst.

In the realm of evaluation of analysis systems, this is about as good as it gets in terms of designing controlled experiments to study the relative strengths of different analysis systems.

Garry Kasparov, widely considered to be the greatest chess player of all time, recently wrote a review of Diego Rasskin Gutman’s book, Chess Metaphors: Artificial Intelligence and the Human Mind.

The review is excellent and covers a lot of ground. However, one particular anecdote stood out as a very interesting example of human-computer symbiosis (emphasis added):

In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. Normally, “anti-cheating” algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less “intelligent” than the playing programs they detect.)

Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.

The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

After the jump, we look at this finding in a more generalized way and map it onto the Palantir approach.

Palantir: like an operating system for data analysis

November 6th, 2009 | Ari Gesher

If you’ve taken the time to peruse the Palantir Government analysis blog, you’ve seen numerous examples of Palantir Government as applied to interesting problems; they are recorded screen captures of our analysis desktop client. It’s a showcase of useful, meaningful, and compelling visual and semantic tools being used to do analysis on a wide range of datasets.

What enabled this analysis? Aside from the obvious hard work of our UI and analysis tools teams, it’s the flexibility and power of the Palantir data platform. More than just a scalable datastore, the Palantir data platforms act as robust and clean abstractions on top of data.

One of the early architecture decisions that we made when building both Palantir Government and Palantir Finance was to separate the respective data platforms from the end-user applications used to actually perform analysis. More than just following the client-server model, this separation made the data servers in both products into generic intelligence infrastructure for analytic problems, with our clients acting as analysis applications on top of those platforms.

And so, one way to look at our data platform is as an operating system for analytic applications. In this post we’ll explore the history of operating systems, understand why they’re so important and see how the Palantir data servers deliver the same potential to revolutionize the writing of analysis software that operating systems did to the writing of general programs for computers.


