Using HKL3000 for Data Collection at Princeton University

This is an obsolete document referring to our retired RuH3R/Xenocs/RAXIS-IV++ system - here for historical record - use this hkl3k document for the 007HF/Varimax/AFC11/Pilatus 300K system.

As part of the controller upgrade to our Raxis IV++ area detector we have a new Linux PC/Framegrabber PC pair that controls data collection. The controller and data collection computer(s) have been replaced. Despite the quite eventful hardware side to the install (it helps if the manufacturer tells the install engineer everything he needs to know) the machine works perfectly well for screening and data collection. As a user you do not have to touch the Windows Framegrabber PC and you will do all your data collection control via the Linux PC. The Framegrabber talks directly to the X-ray machine and the Linux PC talks to the Framegrabber. There's only one keyboard and mouse and the login should always be set to the Linux box. Username and password are currently posted on the CPU box.

Caveat Emptor

This guide to using HKL3000 on our system was originally written with snark set to 11. I own more Macs than I know what to do with (I'm typing this on one) so I adore carefully designed intuitive graphical interfaces. I abhor badly designed ones with equal fervor. The original version of HKL3000 that came with my eventful hardware upgrade turned out to be badly tested junk with several serious issues. It was more like a late alpha software revision. This was a version of Jan 2012. Thankfully things have changed for the better. I've received two updates - March 2012 and December 2012 - and I've started to modify this document in response to bug fixes obtained from Rigaku. By December 2012 I only curse at the monitor infrequently.

Nevertheless you will find places where some of the initial rant still remains.

This page is out of date. I'll be posting an update soon. Rigaku have been quite responsive to my original rant about this software and the two updates I've received (Mar 2012 and Dec 2012) were each an improvement over the prior version. The most recent update fixed two things: there is now a more obvious visual notification of the shutter being open (and an abort button), and D*Trek's dtdisplay program is used so that the current data frame is automatically displayed during data collection. The most egregious issues have been fixed and it collects data quite readily. Although I'm not in love with this software it is very reassuring that Rigaku have been willing to address the major issues that I (and doubtless others) originally identified and provide me with updates. It's quite usable for data collection and processing on my venerable single axis system. I don't use it for anything downstream of that.

HKL3000 is part of the Brave New World of crystallographic computing, and anyone that's used HKL2000 will recognise it instantly. I'm not sure that I like that Brave New World, since I feel that the introduction of the GUI often comes at the price of dumbing down the interface. HKL3000 wants to be the central part of your entire structure determination process. I just want it to collect (and perhaps process) my data and then get out of my way. It's now passable at the former, and it's as good as HKL2000 on the data processing front since it basically is HKL2000. Now that the worst glitches of the Jan and Mar 2012 versions appear to have been addressed, things are a lot more usable. To provide more context for the intemperate rant, I've left the original bugs in the text but struck them out where appropriate. Existing bugs are labeled WARNING: BUG. Some of these might actually be features, but they're pretty undesirable ones. I originally wrote: they might want to deliver this software with a [software author name currently redacted] voodoo doll so you can exact your revenge. Entertaining as that initial thought was, it's not really a justified sentiment any more since the program rarely screws up your data collection now.

While I'm not going to recommend this software unreservedly it's a lot more functional than it was a year ago. You should evaluate its capabilities carefully, preferably on a working system rather than a demo one, before upgrading to it. If you're local to Princeton I can give you a walk-through, but Rigaku can probably connect you with something shinier and newer.

Controlling the Machine

As mentioned above, the Linux PC is effectively controlling the X-ray machine but it does so via an intermediary PC framegrabber computer. Only rarely will you be aware that the latter exists, but if there's a power outage and things have to be restarted both obviously have to be functioning.

The Linux control PC does not have the user accounts that are present on the other machines. This is done deliberately to keep the raw data collection instrumentation away from the data wrangling computers. The /data directory on the Linux control PC (otherwise called mol-xray1) is mounted on the Linux boxes. If you have a Mac with a fixed IP address I can probably arrange to mount it on there too. Because xray1 is on the network like all our other machines, logins from non-Princeton IPs are specifically disabled. If you need to log in from home to check data collection we need to figure out what range of IP addresses you are using and enable that, or log in via one of the few outside-login-enabled workstations and connect from those.

Perhaps the largest practical difference in the workflow with this software is that the initial Connect step datums the Phi axis, and therefore you need to have the Phi axis motor locked down or it will fail to initialize. This wasn't something that you needed to do with our previous software (CrystalClear), where the Connect step was not present. It's a slightly different mode of operation.

Phil's Running Bug/Feature List (Jan 2012 through Feb 2013)

So you don't have to read all of my rambling to figure out bugs and misfeatures:

Using the Graphical User Interface

Cropped image of Desktop




Upon login, the default Desktop has icons for the software on the left hand side. There are two gaudy diamond icons on the left labeled "HKL3000R" and "HKL3000R Raxis IV". You want to use the HKL3000R Raxis IV to control the machine. The HKL3000R version of the program can be used to analyse data you've already collected and cannot control the machine.

You do not want to use the X-ray Generator Control - I've now moved this away from the other program icons. If you want to use Rigaku's collection strategy algorithm instead of the one internal to HKL3000 then that's the icon next to the HKL3000R Raxis IV program, but that won't allow you to collect data by itself. Rigaku's data strategy program was last updated December 2012 along with the rest of the control software.

On the top of the window is the icon bar which you can use to open a Terminal if you want to use standard Unix commands (e.g. for looking in your home directory or managing data). That's the icon right next to the System menu. The web browser (Firefox) is the icon next to that. Since this is a data collection/processing machine, do not install any additional software on this system without permission.

Project tab




Start the software by clicking on the HKL3000R Raxis IV. If the program is currently open you should quit the program and re-open it so you don't accidentally mess with other people's data. Underneath the top menu bar is a series of tabs labeled Project, Collect, Data, Summary, Index, etc. Click on each tab to reach the embedded info for each. If you're screening crystals you'll just use Project and Collect. If you're collecting data or want to auto-index your data you will use more of the tabs further to the right. If you've used HKL2000 then the general layout is familiar.

HKL3000 has the same Project and Sample framework that the old Rigaku software had. Generally speaking you keep images for each protein under a distinct project name, with different crystals being a new sample within a project. Click on the Project Tab to manage projects and samples (shown at left). You can create new projects, save existing ones, load old ones. When HKL3000 starts it defaults to a generic project name - it doesn't save your old project info so you should remember to save it manually before quitting and load it after starting.

It's possible to collect data and process it without changing the default project name. Just make sure you specify a separate and unique data directory name for your data and processing directories. I generally recommend against doing this if you're actually collecting data since it makes more sense to cluster related datasets into one project. Also, NEVER modify someone else's project in this way.
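If you want a quick way to come up with a directory name that won't collide with anyone else's, here is a minimal sketch. The user/project/sample layout under /data, the date suffix and the function name dataset_dir are all my own assumptions for illustration; HKL3000 itself doesn't care as long as the name is unique.

```python
from datetime import date
from pathlib import Path


def dataset_dir(user, project, sample, root="/data", when=None):
    """Suggest a data directory that does not exist yet.

    The root/user/project/sample_YYYYMMDD layout is a hypothetical
    convention, not something HKL3000 requires.
    """
    when = when or date.today()
    base = Path(root) / user / project / f"{sample}_{when:%Y%m%d}"
    candidate, n = base, 1
    # If the name is already taken, append _run2, _run3, ... until it is free.
    while candidate.exists():
        n += 1
        candidate = base.with_name(f"{base.name}_run{n}")
    return candidate


if __name__ == "__main__":
    # Only prints a suggestion; it does not create anything on disk.
    print(dataset_dir("myname", "lysozyme", "xtal1"))
```

Type the suggested path into the data directory box and use a matching one for the processing directory.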

At the top of the HKL3000 window you'll see a "Site Configuration" menu. This is where the current direct beam coordinates are stored, and also allows the system admin (i.e. me) to enter the default masked background regions. You'll find that you can't save new values of this without a password - but if you believe the parameters are set wrong you can get me to change them to more reasonable values. We have determined refined values based on a rather nice tetragonal Lysozyme dataset and loaded them into here. The actual crystal-detector distance refined to within 0.5mm of the reported distance, so that's more than accurate enough for auto-indexing.

New Project window




Here's what you see if you click on the New Project button. Because HKL3000 has delusions of grandeur in being an overall framework for structure solution, it is far more pedantic about projects than the old CrystalClear software - it demands to know your project sequence when you add a new project. We don't plan on using this framework beyond data collection and (optionally) processing. Feel free to type in any old garbage in the sequence box there unless you anticipate trying to solve a structure by SAD on a home source - during the Rigaku demo we were shown how you can do this using S-SAD from tetragonal HEWL crystals, but your crystal probably doesn't diffract like chicken Lysozyme. Most of the options on this screen are a waste of time for the simple purpose of crystal screening. If you don't click on Molecular Replacement it will force you to specify the anomalous scatterers too. The string GNARLY is a valid protein sequence, incidentally.

When you press the "Connect" button in the Collect tab (see below) HKL3000 initializes the machine. If you've forgotten to lock down the Phi axis the program will eventually generate an error and Connect will fail since the Phi axis cannot datum. You must not lock down the Phi axis when the motor is in motion, so if you discover that the Phi axis motor is spinning but the Phi axis is not moving: FIRST quit the program; SECOND check that the phi axis motor has stopped spinning; IF both are true then lock down the Phi axis and restart HKL3000. Remember to waggle Phi a little when you lock it down to make sure the gear and the worm screw are meshed. If you hear the Phi axis make a clunk sound when it starts to spin the gears were not meshed. Tssk tssk tssk.

Bug fix: the bug related to not datuming phi before data collection is now fixed, so you don't have to be hyper-cautious when switching crystals.

WARNING: BUG/SAFETY ISSUE (original report, now fixed) - THERE IS NO ACTIVE INDICATION ON THE SOFTWARE THAT THE SHUTTER IS OPEN. Fixed: the December 2012 software update now provides visual cues that the shutter is open and a small separate window with an Abort button to make it easy to quickly close the shutter. As ever, the yellow shutter lamp on the "light tower" on the generator will tell you shutter status (lit=open), and with the enclosure door open the shutter cannot open no matter what the software wants to happen. Always keep the enclosure door open when working with the generator and close it only when you are outside the enclosure and collecting data. My original comment, from before the fix: however this is a really stupid software feature that needs to be remedied (by Rigaku/HKL) as soon as possible. This may actually break some state laws. It would be actively dangerous in the facility I used to manage in NYC, for example, particularly because it's not clear if an instrument is about to open the shutter or not.

Collect window

Collect is by far the most important tab. Once you've defined Project and Sample you should mount your crystal and click on the Collect Tab which brings up what you see at left. The buttons on the left are the important ones. You need to press the Connect button if this is the first crystal you have shot since starting the program - this starts up the server on the framegrabber PC which is necessary for machine communication. It also initializes the detector and phi axis when it does so. LOCK DOWN THE PHI AXIS BEFORE CLICKING ON CONNECT.

Crystal Check loads default parameters for the sort of images we shoot to test crystals. Data Collection loads the sort of parameters you'd use if you actually want to collect an entire dataset. For Crystal Check you should set distance (move the detector manually and enter the new distance - the software cannot move the detector), number of frames, phi start, frame width (degrees, nearly always 1.0), exposure time (SECONDS not MINUTES) and click the toggle if you want to shoot orthogonal pairs of images separated by 90 degrees. Default exposure time is 10 seconds and that's waaaaaaaaaaay too short for our system (try 300 or 600 seconds). Once you've changed the values the blue button at left marked Collect Sets will be enabled and you can collect the images. Since the wavelength and detector geometry are pre-defined you'd hope that HKL3000 could fill out that resolution vs distance table at the top of the tab right from the start, but it apparently needs you to do something before it will do that.
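If you'd rather not wait for HKL3000 to fill in its table, the resolution at the edge of the detector is easy enough to estimate from Bragg's law. This is a rough sketch, not what HKL3000 does internally: it assumes a flat detector at 2-theta = 0 with the direct beam in the center, a Cu K-alpha wavelength of 1.5418 Angstrom and a detector half-width of 150 mm. Both numbers are my assumptions for a copper source and a 300 mm plate, so substitute the values from Site Configuration if they differ.

```python
import math


def edge_resolution(distance_mm, wavelength_A=1.5418, half_width_mm=150.0):
    """Resolution (Angstrom) of a spot landing on the detector edge.

    Assumes a flat detector perpendicular to the beam with the beam
    hitting its center. The default wavelength and half-width are
    assumed values, not read from the instrument.
    """
    two_theta = math.atan(half_width_mm / distance_mm)
    # Bragg's law: lambda = 2 d sin(theta)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


if __name__ == "__main__":
    for distance in (100, 150, 200, 250, 300):
        print(f"{distance:4d} mm  ->  {edge_resolution(distance):.2f} A")
```

Spots in the corners of the plate go to somewhat higher resolution than this edge value, so treat it as a conservative guide when picking the distance.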

For actual data collection use the Data Collection button which has many of the same parameters as Crystal Check. WARNING: BUG (original report, now fixed): the data collection display that monitors data collection only shows frame 1 even if you are collecting 180 frames. Fixed: the current frame is now displayed using the program dtdisplay, part of Jim Pflugrath's D*Trek program. This has slightly different GUI behavior than HKL's Xdisp program but runs independently of it. WARNING: BUG: the program does not check for enough available disk space in /data (primary data) or /home (indexing etc). Since the framegrabber doesn't check to see if there is space on the Linux PC before deleting frames, you can lose data if you don't have enough disk space. Do a "df -k /data" in a terminal window to check the amount of space available in /data - it's a 200 GB partition with 18 MB frames so even if it's 95% full you've got some space. If it is 95% full tell me to clear up some old data.
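To put numbers on "some space": 5% of 200 GB is about 10 GB, and at roughly 18 MB a frame that is a little over 550 frames. The sketch below does the same arithmetic and can also check the live free space before you start a run. The function names are mine, and it assumes decimal megabytes and exactly 18 MB per frame, so treat the answer as approximate.

```python
import shutil

FRAME_BYTES = 18 * 10**6  # roughly 18 MB per frame, as quoted above


def frames_that_fit(free_bytes, frame_bytes=FRAME_BYTES):
    """Number of whole frames that fit in the given free space."""
    return free_bytes // frame_bytes


def enough_space(n_frames, path="/data", frame_bytes=FRAME_BYTES):
    """True if the partition holding `path` has room for n_frames."""
    free = shutil.disk_usage(path).free
    return frames_that_fit(free, frame_bytes) >= n_frames


if __name__ == "__main__":
    # Worked example: a 200 GB partition that is 95% full has ~10 GB free.
    print(frames_that_fit(10 * 10**9), "frames fit in 10 GB")
```

On the control PC, calling enough_space(180) before a 180-frame run does the check that HKL3000 doesn't do for you.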

Extra features on this tab are: "Status" button at the lower left, which is the most useful - HKL3000 has really bad notification of what the machine is doing at the bottom of the window (the Progress bar is particularly useless) but the Status window that comes up when you hit the button is marginally better; "Align" sub-tab - a red herring since we don't have a video alignment microscope - gives an error if you click it but no damage done; "Manual" sub-tab which really only allows you to move the phi axis (you have to move the detector yourself); "Multi" button on the right hand side which lets you set up multiple data collection runs - not a lot of use for a home source.

More tabs to come later, although those of you who have used HKL2000 will be familiar with Data (apparently loaded automatically from the Collect tab) and Index/Integrate/Scale. Strategy uses the stock HKL2000 strategy method, but Rigaku have one of their own which they consider to be better that runs as a separate program.

Most of the steps in data processing are analogous to my HKL Data Processing Guide which was written in 2002, prior to the era in which y'all have been reduced to button clicking for a living. The basic programs in HKL still underlie how HKL2000 and HKL3000 work - they just write scripts to run these programs via the command line and parse the log files. Since I'm both a Luddite and a Curmudgeon you'll frequently find me muttering while reading the actual logs rather than the pretty little graphs.

Mini Rant - HKL3000 is not an Expert System

There's a pretty clear distinction between programs that are just Graphical User Interfaces (GUIs) and ones that are Expert Systems. An example of the former is CCP4i, which is really just a series of GUIs for the underlying programs. GUIs often make running the programs simpler because you don't have to memorize all of the arcane syntax but they make few meaningful decisions for you. If you've ever used SFALL you're probably grateful not to have to memorize its syntax, although a lot of calculations I personally do are using script files that I've accumulated over the years. I use HKL2MAP at synchrotrons to run SHELXC/D/E because it's a lot faster than hacking the shell scripts that I otherwise used to use for those programs. At the other end of the spectrum are programs that try and make intelligent choices for you: autoProc being a pretty good example, as is James Holton's Elves. Phenix.xtriage is pretty good at figuring out what it needs to know for itself, too, although it's not really a data manipulation program - just a screening one. While some programs might lurk in the gray area between these two poles, HKL3000 and its predecessor HKL2000 are fundamentally just GUIs. They do not make smart choices for the user despite being in a position to do so. They're just substituting button pushing for writing and using script files.

Examples:

Since I now work 20% of the time over in the Chemistry Dept on a small molecule single crystal diffraction instrument, I can attest to the fact that Bruker's Apex2 data program does all of the things listed above in a largely automated fashion. Automatic spot size, suggested exposure times with resolution cutoffs, accurate elapsed times for data collections spanning thousands of frames, automatic integrations in straightforward cases, mostly automated space group determinations and data collection strategies. Although small molecule data is strong and frequently easier to wrangle, these things are not anything that couldn't be implemented for protein data collection in a truly integrated environment.

I could go on all day coming up with other examples but to me it's pretty clear that HKL3000 offers no significant advantages for data wrangling downstream of data collection (and perhaps processing, although also try MOSFLM, XDS, autoPROC or Xia2) and you'd do better to stick to CCP4i or Phenix GUIs if you like that kind of thing. Or in fact the command line if you're what I once referred to as the "die-hard VT100 user". I'm in that latter category - I'm actually old enough to remember what a VT100 is although I was quite a lot fonder of the VT220. (At any time in this paragraph you may choose to invoke the Monty Python Four Yorkshiremen sketch.) I'm writing this using emacs over an ssh connection. I've met one crystallographer who runs refinements using ssh connections via his iPhone. I'm pretty impressed by that, although I'm perfectly willing to be impressed by thoughtful software design too.

Phil Jeffrey, Princeton, last modified March 29th 2012.