Archive for the ‘development process’ Category

Software Dev Intern Projects – 2011

February 20th, 2012 | Ari Gesher

Palantir Technologies Interns, 2011

As we roll into the peak of internship season, it seems like a worthwhile time to talk about just what interns do in their time at Palantir. Our software engineering interns are full members of the development team from the day they arrive. During their time with us, they design, implement, and test their projects while working alongside the full-time engineers.

We believe in hitting the ground running; in fact, before you get your badge on the first day, you must commit code. But don’t worry, this all takes place under the watchful eyes of your mentor: a full-time engineer who’s there to guide you through everything from our development environment, to how to write specifications, to software architecture, to figuring out which cafe has the best lunch.

You can broadly break down the projects the Palantir interns worked on into a number of categories:

Join Us In 2012!

We’re currently hiring for our 2012 intern class as well as full-time positions. For those interested in working on the sorts of projects mentioned here, you’ll want to apply for this job:

For other internships, check out our open positions page.

For those from non-traditional schools with co-op programs that run during the year, we offer co-ops as well. Go ahead and apply for the appropriate intern position, and note in your application when you’d like your co-op to take place.

Read on to dive into a large sampling of software engineering projects our interns worked on this past summer.
Read the rest of this entry »

How to Rock a Systems Design Interview

October 28th, 2011 | John Carrino

Compiler design dependency comic, originally from http://www.xkcd.com/754/
Comic courtesy of XKCD, via Creative Commons License

Note: this is the third installment in our series on doing your best in interviews. Previously: “How to Rock an Algorithms Interview” and “The Coding Interview”.

One interview that candidates often struggle with is the systems design interview. Even if you know your algorithms and write clean code, that code needs to run on a computer somewhere — and then things quickly get complicated. A truly unbelievable amount of complexity lies beneath something as simple as visiting Google in your browser. While most of that complexity is abstracted away from the end user, as a system designer you have to face it head on, and the more you can handle, the better.

At Palantir, many of our teams give a systems design interview along with an algorithms interview and a couple of coding interviews. We don’t expect anyone to be an expert at all three disciplines (although some are). We’re looking for generalists with depth — people who are good at most things, and great at some. If systems design isn’t your strength, that’s okay, but you should at least be able to talk and reason competently about a complex system.

Read on to learn about what we’re looking for and how you can prepare.

Read the rest of this entry »

Fun with jMock

November 22nd, 2009 | Steve Downing

Here at Palantir, a lot of our automatic tests are full-chain tests. A backend server is fired up, client code runs against it, and everything runs much like a production environment. This makes intuitive sense because it’s a faithful approximation of how the system will run in the field.

However, there are some disadvantages to this:

  • Full-chain tests don’t always localize the problem. Tests on a client class might fail even if it was the service that behaved incorrectly.
  • Full-chain tests are relatively slow. Client code is running against an actual remote service. If a client is being tested, the server code still has to do work, sometimes a lot of it, even if that isn’t the focus of the test.
  • The constraints of the test are loose. Full-chain tests can mostly only see whether the operation finished correctly. It’s much harder to figure out whether the operation was done efficiently and without making unnecessary service calls.
  • There’s very little setup flexibility. If you want an RPC to return a specific value, you have little choice but to have your test get the service into a state where it can return that value. This is easy in some cases, but prohibitively difficult in others.
  • Client tests are forced to share any non-determinism leaked from the service. For example, under real conditions, the response to call A might sometimes arrive before the response to call B, and sometimes the other way around. This can result in flaky tests or tests that don’t always simulate the conditions you want to exercise.

What’s to be done? Fortunately, there’s an option that handles these cases elegantly. We also test with jMock, a library that dynamically generates mock objects from arbitrary interfaces. These mock objects can be configured to check that particular methods are called with particular inputs a particular number of times, and then give prescribed responses.

Hit the link to see a concrete example of jMock in action.
Read the rest of this entry »

Palantir Build Contraption

October 22nd, 2008 | Jon Kean

Here at Palantir, we use continuous integration as one of our development practices. Part of this includes running automated builds and tests. This practice is quite common, and is useful because it gives immediate feedback on (some of) the software’s correctness.

How it usually works

  • Developers submit changes to the code base.
  • The continuous build system detects a problem and the build “breaks” (or “goes red”).
  • All developers who made recent changes are notified via email.
  • Someone (usually the guilty developer) locates and fixes the problem.
  • The continuous build system verifies that the problem is fixed and “goes green”.

Ideally, the delay between failure and fix should be as short as possible. But failure notifications are typically sent by email, which can easily be overlooked or ignored. Clearly we can do better.
Read the rest of this entry »

Deploying a distributed system

October 7th, 2008 | Bob McGrew

Distributed systems diagram

At Palantir, we write software that gets deployed at each client, integrated across their sensitive data sets, and maintained and administered by that client’s in-house admins. Most deployed enterprise software is run on a single beefy box: consider wikis, blogging systems, bug tracking systems, or practically any client/server or web client software used today. On the other hand, most enterprise software that runs as a distributed system is hosted: Salesforce.com, Google Apps, or any approach that sells software as a service. What’s fairly unusual about our software is that it’s deployed as a distributed system at each client.

Distributed systems are hard to build and hard to maintain. As long as that distributed system is built and maintained in-house, however, you have a number of advantages:

  • The administrators are full-time product experts who are focused on the mission of keeping your system available and responsive.
  • The development organization can build internal tools for the administrators that only have to be “good enough” and can step in if necessary.
  • It’s easy to get feedback on how the system performs, because there are no sensitivity, privacy, or legal constraints.
  • A single, large deployment allows you to optimize your hardware purchasing and amortize installation headaches across a large number of machines.

This is all great, of course, and if you can host and maintain your distributed system yourself, I’d highly recommend it. Sometimes, however, it’s just not possible. At Palantir, the client data we work with is so sensitive that even we cannot see it, except under very strictly controlled circumstances. It’s also so large that the bandwidth limitations of pushing it into a system hosted by us would be prohibitive.

So suppose that you have to deploy your distributed system in a customer datacenter with external parties maintaining the system. What do you need to consider? In this post, I’ll go into a number of key points that we have faced and addressed at Palantir.

Read the rest of this entry »

Pipes: using unix pipelines for beautiful answers to quick and dirty questions

February 7th, 2007 | Ari Gesher

/loony/bin

As we approach a release at Palantir, we usually cut a stable branch that QA can start testing as a release candidate. Developers may continue fixing bugs and testing on trunk, but we code review changes before committing them to the stable branch. As the release becomes truly imminent, we start asking questions like:

What changes are on trunk that are not in the stable branch?

We’re less concerned with what the changes are and more concerned with who owns the changes. What we really want to know is:

Do the changes on trunk represent pending changes that should be moved to stable or are they further development that shouldn’t be put into the stable branch for this release?

For the most part, the person who can answer that question is the coder who made the changes on trunk. To that end, what we would really love is a report of all files in trunk that differ from the stable branch, along with who last touched each file. There isn’t really an svn command that will do this succinctly, so I started thinking about how to accomplish it. I had an inkling that it could all be solved with a single Unix pipeline, so I set out to craft such a beast. Here’s what I came up with in about ten minutes:

# Report each trunk file that differs from stable, prefixed by its last committer.
for name in $(diff -r --brief --exclude=.svn pgstable/src pgtrunk/src | awk '{print $4}' | grep pgtrunk); do
    author=$(svn info "$name" | grep "Last Changed Author" | awk '{print $4}')
    echo "$author $name"
done | sort | sed 's|pgtrunk/src/||' > difflist.txt

Which produces output that looks like this:

bclinton com/palantir/baz/Fargle.java
gbush com/palantir/foo/Bar.java
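
The two awk '{print $4}' stages are the least obvious part of the pipeline, so here is a minimal sketch that runs the same filters over canned input instead of a real checkout. It assumes the usual GNU diff --brief and svn info line formats; the sample paths, the revision number, and the helper function names are made up for illustration.

```shell
#!/bin/sh
# Canned `diff -r --brief` output (hypothetical paths); no checkout is needed.
sample_diff() {
    printf '%s\n' \
        'Files pgstable/src/com/palantir/foo/Bar.java and pgtrunk/src/com/palantir/foo/Bar.java differ' \
        'Only in pgstable/src/com/palantir/old: Gone.java' \
        'Only in pgtrunk/src/com/palantir/new: Fresh.java'
}

# In a "Files A and B differ" line, field 4 is the trunk-side path.
# In an "Only in DIR: NAME" line, field 4 is a bare file name, so grep drops it.
trunk_paths() {
    awk '{print $4}' | grep pgtrunk
}

# Canned `svn info` fragment (hypothetical values).
sample_info() {
    printf '%s\n' 'Last Changed Author: gbush' 'Last Changed Rev: 1234'
}

# "Last Changed Author: gbush" splits into four fields; the fourth is the user name.
last_author() {
    grep "Last Changed Author" | awk '{print $4}'
}

sample_diff | trunk_paths   # prints pgtrunk/src/com/palantir/foo/Bar.java
sample_info | last_author   # prints gbush
```

Under these assumptions, a file that exists on only one side appears as an "Only in" line and is filtered out, so the report covers files present in both branches whose contents differ.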

How did I come up with such a beast? I deconstruct this inscrutable wonder after the jump.
Read the rest of this entry »

