Wednesday, October 28, 2009

JVM on Xen, from Sun

This could be neat.

Sun has a project where they are trying to run a JVM directly on a hypervisor, without the normal OS in-between.

From their project overview:

Project Guest VM is an implementation of the Java platform in Java and hosted directly on the Xen hypervisor, that is, without the traditional operating system layer. It is based on the Maxine Virtual Machine which is itself written in Java. The result is a Java platform software stack that is all Java except for a very thin microkernel layer that interfaces to Xen.

Briefly, the goals for Guest VM are:

* Exploit access to low-level features, particularly memory management and thread scheduling.

* Enable specialization and optimization of the entire software stack.

* Simplify administration by extending the Java platform to replace the OS.

* Increase developer productivity through the use of modern IDEs.

More information, including instructions for getting the source code, can be found at http://research.sun.com/projects/guestvm.


If it is a single JVM per dom-U, then this could make resource provisioning a snap.

IronScheme Hits RC1

Given my rediscovered enjoyment of parenthesis-based languages, I noted that IronScheme, an R6RS implementation on MS' CLR, has hit RC1.

Although, really, if I were to choose a functional language for the CLR, it would be F#. Fully supported by MS, decent IDE support...unless there is something particularly compelling about IronScheme, F# would be the one to choose.

Tuesday, October 27, 2009

Eclipse and Clojure Unit Testing

So far, I'm enjoying my little Clojure projects. The biggest weakness is the IDE, NetBeans, specifically. I haven't and probably won't bother with the emacs version. I'm sure there's things about it that work better, but I'm way too spoiled.

To be fair, the NetBeans plugin is in alpha state. It should get better - this is just a snapshot at this point in time.

As I've worked on this intermediate app, I've had occasion to go back and modify some of the java code I'd already written. Given all the refactoring and testing handholding which NetBeans provides, making those changes was pretty easy.

Here's the world in which I'm living: To get unit tests even somewhat integrated, I call and organize them from main.

(ns some-ns.main
  (:gen-class)
  (:use
    clojure.contrib.test-is))

(defn -main []
  (run-tests
    'some-ns.someclass
    'some-ns.otherclass))
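
For context, here's a rough sketch of what one of those test namespaces might look like. It assumes Clojure 1.1 or later, where clojure.contrib.test-is was folded into the core distribution as clojure.test. The namespace and test names are placeholders I made up, not code from the actual project.

```clojure
;; Hypothetical test namespace; the names are placeholders for illustration.
(ns some-ns.someclass-test
  (:use clojure.test))

;; deftest defines a named test; each `is` form records one assertion.
(deftest addition-works
  (is (= 4 (+ 2 2)))
  (is (= "ab" (str "a" "b"))))

;; run-tests returns a summary map with :test, :pass, :fail and :error counts.
(def results (run-tests 'some-ns.someclass-test))
```

Calling run-tests from -main with a list of namespaces, as above, just runs each of those namespaces in turn and prints a combined summary.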

Because of the IDE integration differences, I can't say that I'm more productive in Clojure. A huge difference is highlighting syntax errors. In Java, they pop up immediately and are easily dealt with. In Clojure, I have to compile before I find out I fat-fingered something.

Debugging is similar. No breakpoints and horrific stack-traces. A good chunk of my code is devoted to logging (which is good and all, but c'mon).

I'm sure things will improve over time. Like I said, the plug-in is alpha. If the language gains much momentum, I expect to see improvements. I have the feeling that it won't gain enough to compare to the ease of Java (I can't believe I just typed that) overall.

Overall, I'm enjoying the experience. I am accomplishing a lot, pretty quickly, and it seems like there is a lot less jumping around trying to keep all the different parts working together.

Friday, October 16, 2009

Beginning Clojure Macros - Part 2

Last time, we went over what macros were, and their parts. I also gave a pretty poor example of how to use them. I'll be taking that same code and being a little more efficient about it.

(defn parse-ip [ip current]
  (if
    ;;; probably should use the network lib to be as accurate as possible...
    ;;; but I won't
    (re-matches #"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" ip)
    (inc-summary-counter current ip)
    current))

(defn parse-datetime [ datetime current ]
  (if
    ;;; since we're splitting on spaces, we get back "[19/Jan/2006:04:30:26"
    (re-matches #"\[.{20}" datetime)
    (inc-summary-counter current (subs datetime 1 12))
    current))

There's a lot of repetition there. It's a simple if statement, but it will do for this example.


(defmacro parse-field [ valid-expr update-expr skip-expr ]
  `(if
     ~valid-expr
     ~update-expr
     ~skip-expr))

Awesome! We just re-created if! Let's use it...


(let [[chash ctext] (get-field results :path-summary 6 nline)]
  (parse-field
    (re-matches #"/.+" ctext)
    (inc-summary-counter chash ctext)
    chash))

At least that's more self-explanatory than a commented if-statement. Let's sort-of expand this.


(let [[chash ctext] (get-field results :path-summary 6 nline)]
  ;;; replacement of "parse-field" begins here
  (if
    ;;; ~valid-expr
    (re-matches #"/.+" ctext)
    ;;; ~update-expr
    (inc-summary-counter chash ctext)
    ;;; ~skip-expr
    chash))

Yep. That's it. Effectively, the only thing that changed was parse-field into if, right? Yes, but rather than evaluating the expressions and then passing their values, we're passing the expressions themselves. They don't get evaluated until their values are required. Clojure's sequences are lazy, but the arguments to an ordinary function are always evaluated before the call, so a function can't skip something like a long-running database call sitting in the update expression. Sometimes it's just better to build in those assurances yourself, anyway.
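
To make the deferral concrete, here's a small sketch that compares parse-field with an ordinary function doing the same job. The names parse-field-eager, slow-update and calls are made up for this illustration; slow-update just stands in for any expensive call and counts how often it runs.

```clojure
(defmacro parse-field [valid-expr update-expr skip-expr]
  `(if ~valid-expr ~update-expr ~skip-expr))

;; Function equivalent (hypothetical): all three arguments are evaluated
;; before the function body ever runs.
(defn parse-field-eager [valid update skip]
  (if valid update skip))

;; Stand-in for an expensive call; it counts how many times it is invoked.
(def calls (atom 0))
(defn slow-update [m k]
  (swap! calls inc)
  (assoc m k 1))

(parse-field-eager false (slow-update {} "a") {})
(def eager-calls @calls)   ; 1: the update ran even though its result was discarded

(reset! calls 0)
(parse-field false (slow-update {} "a") {})
(def macro-calls @calls)   ; 0: the update expression was never evaluated

;; The expansion shows why: the expressions are dropped into an if as-is.
(macroexpand-1 '(parse-field (valid? x) (update-it x) x))
;; => (if (valid? x) (update-it x) x)
```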

Why not a function?


Another way to achieve the same thing would've been to pass anonymous functions to another function. This would achieve the deferral of expensive evaluations, encapsulation, expressiveness, etc., as well, thanks to the power of closures:

(defn parse-field-fn [ current-hash current-text valid-fn update-fn ]
  (if
    (valid-fn current-text)
    (update-fn current-hash current-text)
    current-hash))

(let [[chash ctext] (get-field results :path-summary 6 nline)]
  (parse-field-fn
    chash
    ctext
    ;;; valid-fn receives the text as its single argument
    #(re-matches #"/.+" %)
    ;;; inc-summary-counter is a macro, so it has to be wrapped to be passed around
    #(inc-summary-counter %1 %2)))


When should I use macros instead of functions?


You should look for opportunities any time you're writing a function to replace structure. I only used a single if statement, but it could have had several. Another good opportunity is to replace a complicated function. If you have multiple nested if statements, conds, whatever, the main code can be easier to read by using a macro.

In all, though, learning macros - and any metaprogramming techniques - is worthwhile. It gives you greater control assembling the various bits and pieces which make up the program.

Beginning Clojure Macros - Part 1

I finally got around to learning the macro system in Clojure. When I'm picking up a new language, one of the first things I do is write an app which parses log files. I keep some around, just for that. There's a lot on the web about learning Clojure, but not as much about its macro system, not to mention macros, in general.

What's a macro?


According to this tutorial on which I've been leaning heavily:
Macros are used to add new constructs to the language. They are code that generates code at read-time.

To further oversimplify, macros are some really smart text substitution. Sort of.

You use macros to re-use code structure. You probably already do this with generic algorithms as part of your abstraction, but macros let you generate those algorithms dynamically. Applied appropriately, they result in cleaner code.

What is macro expansion?


When the compiler takes your source code, the first thing it does is look for macro calls. It "expands" each call into the body given in the macro's defmacro, replacing the one with the other.

Consider the text "The quick brown fox jumps over the lazy dog". There's a structure to that sentence, "The <animal> jumps over the lazy dog" (and more that we could abstract away, but we're keeping it simple here). If our macro is "<animal>", and we've defined it as "gazelle", then the resulting sentence would be "The gazelle jumps over the lazy dog".

Simple enough, right?
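
As a rough sketch, the same sentence trick can be written as an actual Clojure macro. The name sentence is made up for this example, and the backtick and "~" are explained with the real code further down.

```clojure
;; The sentence "template" as a macro; `sentence` is a made-up name.
(defmacro sentence [subject]
  `(str "The " ~subject " jumps over the lazy dog"))

(sentence "gazelle")
;; => "The gazelle jumps over the lazy dog"

;; macroexpand-1 shows the code the macro call is replaced with.
(macroexpand-1 '(sentence "gazelle"))
;; => (clojure.core/str "The " "gazelle" " jumps over the lazy dog")
```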

An actual code example



Log files have a bunch of different fields, and they all have their little peculiarities to deal with. To that end, I've got a couple of filter-ish functions: parse-ip and parse-datetime.

The two share a lot of similarities. They both check to see if the data is valid, and if it is, then update the current data and return the results, otherwise return the original results. Inside of the update, we update the map, either creating a new entry with a value of "1", or incrementing the existing entry.

I've added a bunch of comments to help explain the various pieces of a macro.

(defn parse-ip [ip current]
  ;;; accepts the IP address field as a string
  ;;; the current hashmap is updated and returned

  ;;; is it actually an ip address?
  (if
    ;;; probably should use the network lib to be as accurate as possible...
    ;;; but I won't
    (re-matches #"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" ip)

    ;;; increment the count
    (assoc current ip
      (if (contains? current ip)
        (+ 1 (get current ip))
        1))

    ;;; not an ip address
    current))

(defn parse-datetime [ datetime current ]
  (if
    ;;; since we're splitting on spaces, we get back "[19/Jan/2006:04:30:26"
    (re-matches #"\[.{20}" datetime)

    ;;; keep just the date part, e.g. "19/Jan/2006"
    (let [ dt (subs datetime 1 12) ]

      (assoc current dt
        (if (contains? current dt)
          (+ 1 (get current dt))
          1)))
    current))


We'll take care of the hash update part, first, and replace that whole assoc form with something else.

;;; our macro takes two parameters, like a function
(defmacro inc-summary-counter [ hset nkey ]
  ;;; see the little "`" at the beginning?
  ;;; that means "don't evaluate anything, just return the text"
  `(let
     ;;; there's two "weird" things in this let form
     ;;; the "#" after the first "hset"
     ;;; and the "~" before the second "hset"
     ;;;
     ;;; the suffix "#" means "generate a symbol for this",
     ;;; essentially a name which is guaranteed unique to this macro expansion
     ;;;
     ;;; the prefix "~" means "expand this passed variable"
     ;;; "~hset" will be replaced with the text of "hset"
     ;;;
     ;;; the reason for this particular little trick is to make sure that "hset"
     ;;; is evaluated only once, and its result kept in a binding
     [ hset# ~hset
       nkey# ~nkey ]

     ;;; all of this will be returned as-is
     (assoc hset# nkey#
       (if (contains? hset# nkey#)
         (+ 1 (get hset# nkey#))
         1))))


Well, that was a lot to "save time", wasn't it? Here's our updated filter code, minus the extra comments, but using inc-summary-counter:

(defn parse-ip [ip current]
  (if
    ;;; probably should use the network lib to be as accurate as possible...
    ;;; but I won't
    (re-matches #"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" ip)
    (inc-summary-counter current ip)
    current))

(defn parse-datetime [ datetime current ]
  (if
    ;;; since we're splitting on spaces, we get back "[19/Jan/2006:04:30:26"
    (re-matches #"\[.{20}" datetime)
    (inc-summary-counter current (subs datetime 1 12))
    current))

Much prettier, wouldn't you say? Nothing you couldn't do with a function, but it is what is happening that is important.
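
For comparison, here's roughly what the plain-function version of the counter update could look like. The name inc-summary-counter-fn is made up to keep it apart from the macro; this is a sketch, not code from the log parser.

```clojure
;; Function version of the counter update; the -fn suffix is just my naming.
(defn inc-summary-counter-fn [hset nkey]
  ;; (get hset nkey 0) falls back to 0 for a missing key,
  ;; so a single inc covers both the "new entry" and "existing entry" cases.
  (assoc hset nkey (inc (get hset nkey 0))))

(inc-summary-counter-fn {} "10.0.0.1")
;; => {"10.0.0.1" 1}

(inc-summary-counter-fn {"10.0.0.1" 1} "10.0.0.1")
;; => {"10.0.0.1" 2}
```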

Everywhere you see inc-summary-counter and its parameters, that expression is replaced with the body of its defmacro.

Here's parse-ip again, with the macro pseudo-expanded.

(defn parse-ip [ip current]
  (if
    ;;; probably should use the network lib to be as accurate as possible...
    ;;; but I won't
    (re-matches #"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" ip)
    ;;; inc-summary-counter was here
    (let
      ;;; the ~hset and ~nkey have been replaced
      [ hset# current
        nkey# ip ]
      (assoc hset# nkey#
        (if (contains? hset# nkey#)
          (+ 1 (get hset# nkey#))
          1)))
    current))

Remember that macro-expansion happens before compilation? So, after the macros are expanded, the above code is what is passed to the compiler. I'll have a more powerful example next time.
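
If you'd rather see the real expansion than my pseudo-expanded one, macroexpand-1 will show it. This is a sketch you can paste into a REPL. The real output differs a bit from the hand-written version: core symbols come back namespace-qualified (clojure.core/let and so on), and hset# and nkey# turn into generated names along the lines of hset__123__auto__, where the number varies from run to run.

```clojure
(defmacro inc-summary-counter [hset nkey]
  `(let [hset# ~hset
         nkey# ~nkey]
     (assoc hset# nkey#
            (if (contains? hset# nkey#)
              (+ 1 (get hset# nkey#))
              1))))

;; Quote the call so it is handed to macroexpand-1 as data rather than run.
(def expansion (macroexpand-1 '(inc-summary-counter current ip)))

;; The expansion is an ordinary list whose first element is the let.
(first expansion)
;; => clojure.core/let

;; The macro itself behaves like the hand-written assoc form.
(inc-summary-counter {"10.0.0.1" 1} "10.0.0.1")
;; => {"10.0.0.1" 2}
```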

Wednesday, October 14, 2009

The Four Quadrants of Technical Debt

Martin Fowler has a piece breaking down "Technical Debt", i.e., shortcuts you take now will have to be "paid back" in the future.

The argument was made that some Technical Debt was not only inevitable, it was desirable. If taking on that debt meant making a ship date, then that debt was worthwhile.

The debt metaphor reminds us about the choices we can make with design flaws. The prudent debt to reach a release may not be worth paying down if the interest payments are sufficiently small - such as if it were in a rarely touched part of the code-base.

He also provides a nice graphic breakdown, worth reading. Check it out.

Monday, October 12, 2009

Using RUNAS for SQL Management Studio

Ran across this today, and just don't want to lose it.

The short version: if you need to connect to a Windows SQL server in a different domain, the runas command has a /netonly switch.

runas /netonly /user:domain\username "C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\Ssms.exe"

Neat, huh?

Friday, October 9, 2009

Exploding Software Myths - MS Research

Over the last few years, Microsoft has been putting real study into development processes and techniques. Which makes sense, since they've got enough development teams to be able to do (mostly) controlled experiments.

Some of their findings:
  • TDD: "code that was 60 to 90 percent better in terms of defect density...took longer to complete their projects—15 to 35 percent longer"
  • Assertions: "definite negative correlation: more assertions and code verifications means fewer bugs. Looking behind the straight statistical evidence, they also found a contextual variable: experience."
  • Organizational Structure: "Organizational metrics, which are not related to the code, can predict software failure-proneness with a precision and recall of 85 percent"
  • Remote workers: "the differences were statistically negligible" for distributed development

“I feel that we’ve closed the loop,” Nagappan says. “It started with Conway’s
Law, which Brooks cited in The Mythical Man-Month; now, we can show that,
yes, the design of the organization building the software system is as crucial
as the system itself.”

Awesome.

Thursday, October 8, 2009

Considering Clojure

I've been looking at Clojure for a while, now. I liked Lisp, back in the day, but never got particularly good at it. Since then, I've done some minor projects in Scheme (Chicken Scheme, to be specific). The syntax and programming styles really "did it" for me. Problems just seemed simpler to solve.

When I was working on an earlier revision of this project in .NET, I passed up on F#. I used it for some small test apps, liked it a lot, but decided against it. The main reason is that nobody else is using it. This might change with its inclusion in VS.NET 2010, but I'm not going to hold my breath. Besides, C# has a lot of functional-like syntax these days, so while I may miss some of the F# sugar (pattern matching, for example), I don't feel all that hemmed in with C#.

This next chunk of code I have to write for my little indexing project is discrete from the rest of the project. If Clojure doesn't take off as more than an interesting niche language, I could easily find myself replacing it with bog-standard Java.

'Cause I have to admit it: Java sucks. It isn't that it is hard, it's that it is a pain in the ass. In some ways, I preferred programming in C - I spent a lot less time working, it seems. Maybe because I did so much in C; I don't know. I do know that just about every language I've tried since then (except PIC) has been less of a hassle.

I do like all the JVM application containers, though. To me, that's the real winner for Java.

Which leaves me with the whole "changing horses in midstream" problem. I really should write the whole thing in one language. You wouldn't think that would be too much to ask, would you? I certainly wouldn't hesitate to ask it of someone else.

So, I'll probably keep poking around through the tutorial. I may end up with working code which I end up using for the next bit. At the very least, I'll have a good idea as to whether or not this was a good idea.

LiquiBase - Database Version Control

If you're a fan of Ruby on Rails, then you probably know about migrations (and what a pain they can be).

Well, LiquiBase promises to bring the same benefits (and headaches) associated with Rails-style migrations. I haven't had a chance to check it out, yet, but I will update when I do.