Posts Tagged ‘Artifactory’

Artifactory Online – the case of distributing Groovy++

May 5, 2010

After working with the open source version of Artifactory and thoroughly exploring its add-ons, I knew the moment would come to get my hands on its cloud solution – Artifactory Online. It just made sense to “close the loop” this way. The moment I heard about an online instance running 24×7, without having to take care of anything, it sounded really, really nice.

I can’t say we spend a lot of time administering our open source version at Thomson Reuters. Quite the opposite – I only need to take it down for upgrades from time to time. But it still takes up a machine. A virtual one, of course, but still – that’s CPU and memory that could be well spent somewhere else. So having an online instance not only gives peace of mind, freeing everybody from taking care of one more server and one more database – it frees some hardware resources as well.

Well, that’s the beauty of cloud computing, when it works. And Artifactory certainly does!

But my first use of Artifactory Online was for a slightly different purpose – Maven support for the Groovy++ project. There was a clear need to host Groovy++ binaries in a public Maven repo.

What options are available today?

  • OSS repository hosting from Sonatype.
    It’s a good free solution, but like any other free solution it only gives you so much: if you don’t mind being at the mercy of other people with certain demands about what your POMs should look like, then it’s a good way to go. But I’d prefer my own repository, which I can configure the way I want without sharing it with other projects or asking for favors. Also, bear in mind you get no security whatsoever – all your binaries are open to everybody, anytime. And it only works for open-source projects, which can be another showstopper.

  • Another option is to host a public Apache or nginx server and just make the files available following Maven’s naming conventions, like it’s done on "repo1". Besides the lack of security (again), this kind of storage is prone to file corruption: after all, it’s just dumb file storage, not an intelligent repo manager. You can’t use Maven to deploy artifacts and it provides no additional services, like virtual repos, artifact search or usage statistics.

  • Public hosting of open-source Nexus or Artifactory – it’s much better and we can finally protect it the way we want. But we still need to pay for hosting, memory and bandwidth, and we now need to install and administer it on top of everything else. And maybe add some extra protection.

  • Artifactory Online. The best way to go, if you ask me. Not only does it provide a cloud-based Artifactory instance running 24×7, but it does so with all add-ons installed, so you really get the ‘Full Monty’! It solves our original problem of hosting binaries in a public Maven repo, but it doesn’t stop there: as I’ll show shortly, running a private Artifactory instance brings other advantages to your projects.

We settled on the last option: meet http://groovypp.artifactoryonline.com/!
The initial setup went very fast as there were very few things to take care of, actually:

  1. Registration

  2. Creating a deployment user and "settings.xml". The fact that Artifactory provides a way to generate a new “settings.xml” (skip to 00:10:40) and store encrypted passwords (00:11:55) comes in very handy:

    <servers>
        <server>
            <id>groovypp</id>
            <username>username</username>
            <password>\{ABAeqq\}pIcMooZ8G/2Y2drgC99SDw==</password> <!-- Sample -->
        </server>
    </servers>
    
  3. Instructing Maven about the new repositories:

    <properties>
        <repo>http://groovypp.artifactoryonline.com/groovypp</repo>
    </properties>
    
    <distributionManagement>
        <repository>
            <id>libs-releases-local</id>
            <url>${repo}/libs-releases-local</url>
        </repository>
    </distributionManagement>
    
    <repositories>
        <repository>
            <id>libs-releases</id>
            <url>${repo}/libs-releases</url>
        </repository>
    </repositories>
    
    <pluginRepositories>
        <pluginRepository>
            <id>plugins-releases</id>
            <url>${repo}/plugins-releases</url>
        </pluginRepository>
    </pluginRepositories>
    

As you see, we’re using the new repo not only for <distributionManagement> but as our only Maven repository.
From now on we only talk to groovypp.artifactoryonline.com/groovypp/libs-releases.
This is a “virtual repository”, a “gateway” Maven connects to for retrieving any third-party library. I can now add additional Maven repositories by editing it in Artifactory; there’s no need to update the POM any more when new external repos are added to the project.
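
One detail worth double-checking in a setup like this: Maven selects deployment credentials by matching the <id> of a <server> entry in "settings.xml" against the <id> of the repository it deploys to. The sample above uses "groovypp" for the server and "libs-releases-local" for the deployment repository. That may simply be a simplification made for this post, but as written the two would not match. A minimal sketch of a consistent "settings.xml", assuming the repository id from the POM is the one to keep:

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
    <servers>
        <server>
            <!-- Assumption: same id as the <repository> under <distributionManagement> -->
            <id>libs-releases-local</id>
            <username>username</username>
            <!-- Sample value copied from above; use the one Artifactory generates for you -->
            <password>\{ABAeqq\}pIcMooZ8G/2Y2drgC99SDw==</password>
        </server>
    </servers>
</settings>
```

Renaming the repository id in the POM to "groovypp" instead would work just as well; what matters is that both sides agree.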

After this quick setup I ran it for the first time. I was expecting somewhat slower performance than what we have in the office, where Artifactory runs on the same network. After all, we’re talking about a remote repository running somewhere across the ocean.

But the download was pretty fast. It depends on the bandwidth, of course, but I can’t say that a significantly more distant repository slowed me down. The UI was very responsive and Maven filled an empty local repo fast enough that I didn’t notice any significant difference. Good!

The Groovy++ project has now been happily using Artifactory Online for several months and releases; you’re always welcome to download the latest version manually or give it a try with Maven.

What else can I say about running a private repo like that?

I think the main beauty of it is being able to “go public” in a matter of minutes. No setups, no worries – you have your very own binary storage, intelligent and secured, that can be used for any purpose. That’s right, Artifactory can serve any binaries, not necessarily Maven artifacts. So one can store practically anything there and then secure it or back it up safely.

Makes me think of various App Store services where people publish their Android / iPhone applications and enjoy the ride. That’s good, I believe “going public” should be easy for anyone today – this way creativity meets no entry barriers!

I only have a single request to the Artifactory developers – an option to create aliases for an existing repo. This would allow reusing the same repository for different projects or purposes: http://projectA.artifactoryonline.com/ and http://projectB.artifactoryonline.com/ would point to the same Artifactory instance but would be used by different people.

Overall, a very pleasant experience!
Exactly what I was expecting – can’t help it but these guys never disappoint 🙂

Artifactory Power Pack – is it really powerful?

March 8, 2010

We have been using Artifactory at Thomson Reuters (ClearForest) for more than a year now.

My first Maven repository manager was Nexus – we were using it at my previous workplace. When I came to Thomson Reuters and started working on the new CM infrastructure, I decided to switch to Artifactory, though. Mostly due to its richness of features and its efficient checksum-based storage model for binaries, which was the biggest difference for me between the two products.

We started with version 1.3 and then went through all major upgrades: 2.0, 2.1 (it was a big update: new searches, artifacts metadata, add-ons, and move/copy operations on artifacts). We’re now running version 2.2.1 with Add-ons Power Pack.

To tell you the truth – we really love it as it worked perfectly through the whole period (except in some cases where support and solutions were provided on the same day!).

Over time, I’ve also learned to appreciate how Artifactory is moving away from being a Maven-only repository manager and becoming a general-purpose storage manager for any kind of binaries and build system. Ivy and Gradle support (with a new collaborative relationship just announced) is already available and it’s a really good start. Integration with Hudson and TeamCity (will be available soon) also comes in very handy. In fact, one can store any kind of binaries in Artifactory, not related to either Maven or Java in any way!

In short, Artifactory knows how (and is willing) to cooperate with all major players in today’s arena and that’s a really impressive achievement for something that started as a free-time Maven repo manager project. Well done, guys!

But my particular interest in the last months was its Power Pack offering.
We run Hudson pretty intensively as our CI server so when JFrog-ers announced they have a special “Hudson support” – they certainly had my attention!

Now that we have it installed, I’d like to see: does it matter? I mean, is it really that powerful? The short answer is yes, it does and yes, it is. There’s no doubt about it – see below.

To start with – Power Pack offers different things and even with all our appreciation for Artifactory we’re not using all of them. There’s simply no need for us to right now.
But what we do use is really saving us time (and, therefore, money) on a daily basis:

  • Hudson integration
  • Properties
  • Smart Searches
  • Watches

Now, let’s take each of them apart.

 

Hudson integration

This one is definitely the best and draws the most attention, for obvious reasons. The idea is both simple and ingenious – let’s ask the CI server (Hudson/TeamCity/Bamboo) to push all build environment data to Artifactory! After all, when a build job is running, it has all the environment information one can think of: OS type and version, JVM version, modules built, their dependencies and versions … Until today all this information was buried somewhere in Hudson logs and deleted, eventually (I mean, we do need to clean up our build logs sometimes, don’t we?)

Not any more – Hudson integration establishes a bi-directional link between Artifactory and Hudson for each job run. Finally, those two start talking to each other!

How it works:

  • Hudson Artifactory plugin is installed
  • Hudson is configured and Artifactory server is added
  • Hudson job is configured to run "mvn clean install" rather than "mvn clean deploy"
       That’s right, we’re not using maven-deploy-plugin any more
  • Hudson job is configured to deploy to Artifactory server (specified previously)
       and one of the repos available – a nice drop-down list lets you choose it:

        Hudson Job Config

When (and if!) job finishes successfully – all artifacts archived during the build (<archivingDisabled> should be set to “false” in job’s “config.xml” but that’s a default value) will be deployed by Hudson to Artifactory in one go:

Hudson Deploy 

It’s not truly atomic (if the process fails in the middle for some reason, my guess is nothing would be un-deployed) but it’s still much better than what Maven does by default: deploying each artifact the moment it is ready (so if the build process fails in the middle, some newer artifacts would be deployed already while others would stay at the previous version).

As you see, in this sense – Hudson’s way of deploying to Artifactory is much better as it only starts when the build has finished successfully. On top of that, for some weird reason Maven’s traditional deploy has let me down recently with:

Error installing metadata: Error updating group repository metadata
The requested operation cannot be performed on a file with a user-mapped section open

Seems to be some corruption issue that I couldn’t solve.
But "Ok" – I thought to myself – "One more reason not to use ‘mvn deploy’"

Now, does it scale?

After all, Hudson needs to deploy all created artifacts – what if there are too many of them?
In our case, it scales pretty well and there’s no problem whatsoever – our biggest job is publishing 170+ artifacts this way and it works just fine.

Ok, so what else does it actually do?

A lot. First of all, you now have a link to Artifactory in Hudson’s job: 

Hudson Artifactory link

Once we follow it to Artifactory – we get to a page where all build environmental data is stored: 

 General Build Info

So we have a "Properties" section here with JVM and OS versions recorded (though I wish there were some more), a link back to the Hudson job (I told ya those two started talking to each other!) and, most importantly, "Published Modules":

 Published Modules

For each published module, all its dependencies are recorded as well, in case we ever need to go back in time and figure out which dependencies we need to re-create the module:

Published Modules Dependencies

Following "Show In Tree" link we come to the usual artifact’s location in one of our repos where there’s now a new "Builds" tab: 
 
Builds

As you see – we now have exact, bi-directional and traceable information about all jobs that ever deployed our artifacts.

And, like I said, I think it’s a lot, since we now know for each artifact how it ended up in Artifactory, by whom and when. We know which job created it and we know what else was published by this job. For me it’s like the difference between my dad’s old garage (where you can find everything but nobody has any idea how things got there) and my mom’s kitchen (where every little thing has an origin and an owner).

Don’t you love it already?

 

Properties

Historically, Maven doesn’t add much information to an artifact when it’s deployed. Nor does it offer any way to do so. Of course, a certain amount of metadata is added to each *.jar created (like its original POM) and each artifact has the traditional <groupId>:<artifactId>:<version>:<classifier> coordinates (which is a huge improvement over Ant, if we really want to look back for a moment).

But, unfortunately, it only goes so far.

What if we want to mark or tag or label (pick your favorite name) an artifact?
A group of artifacts?

How about setting a "product=true" property on those artifacts that are final products (and not intermediate jars)? We may talk a lot about artifacts and things but after all – people need working products, right? Those having a "qa.status=passed" label on them. Or at least "qa.status=ok", maybe.

As we can "label" e-mails in Gmail (surprisingly, some people don’t – I think they don’t know what they’re missing) or "tag" Delicious links – I would love to do the same with artifacts!

Some of them I would like to label manually, like QA steps in product lifecycle:
"qa.status = New => Accepted => Rejected => Passed => Graduated (?)"

Other properties I would like to be set automatically, when an artifact is deployed: "build.number=35", "product=true" (if the artifact is a ZIP file), "jvm=1.6" and the like.

Not surprisingly, there are two ways to set properties in Artifactory:

  • Manual
  • Automatic

The manual process is demonstrated here and, basically, it goes like this:

  • Define a property set: qa.status, qa.version, qa.anything
  • Choose the kind of value each property accepts: any value, single-select, multi-select
  • Update your repo definition to make this property set available for it
       Watch out! If you miss this step – it will not work (happened twice to me)
  • For any artifact or folder in the tree – go to the new "Properties" tab and add a property

I agree, it’s more similar to Outlook “categories” (than to Gmail labels) and is a little bit involved but .. ok, that’s how it works for now. Maybe it’ll improve.

Anyway, being a software developer for life – I’m naturally more interested in things happening automatically. So how do I set a property on an artifact during the build process?

I want to specify a POM <property> that will become an Artifactory property!

The answer is matrix-params:

<distributionManagement>
    <repository>
        <id>qa-releases</id>
        <url>http://srv/artifactory/qa-rel;buildNumber=${number};rev=${rev}</url>
    </repository>
</distributionManagement>

As you see, it is simply a pair of arguments added to the deployment repo definition. Their values are usually taken from regular Maven properties and can be updated by any POM.
For example, to implement our “Products vs rest of artifacts” vision – all we need to do is to add a "product=${product}" matrix param:

<distributionManagement>
    <repository>
        <id>qa-releases</id>
        <url>http://srv/artifactory/qa-rel;product=${product}</url>
    </repository>
</distributionManagement>

Some top-level <parent> POM will have it set to "false":

<properties>
    <product>false</product>
</properties>

.. but those POMs packaging a final product will have it set to “true”:

<properties>
    <product>true</product>
</properties>

Can it be any simpler than that?!
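
To make the inheritance explicit, here is a sketch of a product module’s POM. It relies on Maven interpolating "${product}" after inheritance has been applied, so the <distributionManagement> URL inherited from the parent picks up the value overridden in the child. The "com.example" coordinates and the artifact names are hypothetical.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>

    <!-- Hypothetical parent: declares <distributionManagement> with ";product=${product}"
         and sets <product>false</product> -->
    <parent>
        <groupId>com.example</groupId>
        <artifactId>parent</artifactId>
        <version>1.0</version>
    </parent>

    <!-- Hypothetical module that packages a final product -->
    <artifactId>product-distribution</artifactId>
    <packaging>pom</packaging>

    <properties>
        <!-- Overrides the parent's "false"; the inherited deployment URL
             then ends with ";product=true" for this module only -->
        <product>true</product>
    </properties>
</project>
```

Every other module keeps inheriting "false" without declaring anything.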

Here’s a blog post demonstrating the same technique for the purpose of artifacts staging and promotion through tagging.

Today, there’s one problem here, though – if we use Hudson integration and switch to "Hudson deploy" (see above) – this <distributionManagement> tag isn’t worth a lot, is it?

And there’s no way to set up any matrix params from the Hudson job configuration, where the deployment repo is specified. The workaround is simple, though – one just needs to edit the job’s “config.xml” (.hudson/jobs/JobName/config.xml) file manually and restart Hudson or use “Manage Hudson” => "Reload Configuration from Disk":

<publishers>
    <org.jfrog.hudson.ArtifactoryRedeployPublisher>
        <details>
            <artifactoryName>http://srv/artifactory</artifactoryName>
            <repositoryKey>qa-rel;product=${product}</repositoryKey>
        </details>
        <deployArtifacts>true</deployArtifacts>
        <username>..</username>
        <scrambledPassword>..</scrambledPassword>
    </org.jfrog.hudson.ArtifactoryRedeployPublisher>
</publishers>

A bug has been opened, so I’m sure it’ll be fixed soon.

Ok, so we have our properties (tags, labels) set – now what? How do we use them?
That’s exactly what the next slide is about …

 

Smart Searches

In Gmail, searching for labeled mails is a matter of typing "g+l+label" (btw, I much preferred the “Labs” version over the “graduated” one). In Artifactory it’s a little bit involved (again) but that’s due to the fact that Artifactory searches are much more capable.

I believe options provided today would satisfy the most demanding (and esoteric) "querist":

  • Quick Search
  • Class Search
  • GAVC Search
  • Property Search
  • POM/XML Search

The first three are pretty obvious and very helpful indeed. I use GAVC Search most of the time, and a Class Search occasionally. But it still amazes me how fast Artifactory scans through its indices to locate all instances of, say, Scanner class:
 Scanner Search

It is the new Property Search we’re after today – it allows building a query out of a number of properties.

Like searching for all artifacts where "qa.version=1.0.1" and "qa.status=In QA".
Or, simply put, “What’s being checked today for the upcoming ‘1.0.1’ release?”

QA Search

It doesn’t matter how properties were set (either manually or automatically) – we can search for all of them! For example, "build.name" and "build.number" are sent by Hudson automatically so we can search by "build.number" as well: 
 Build Number Search

Search results can also be added or subtracted from each other – this is useful when they need to be either expanded or filtered with additional queries. They can also be saved for later use to perform a single operation on all results, where options are “Move”, “Copy” and “Delete”.

The blog post I’ve mentioned already shows exactly that – how artifacts can be

  1. Searched for
  2. Promoted to another repository with "Move" / "Copy" operation

As you see, using Property Search anyone can find what he’s looking for (assuming properties were set in the first place, of course): be it a QA person, looking for the last binaries to download and test or a Dev manager, looking for the binaries being QA-ed now.

The last advanced search is POM/XML Search, which allows searching through all POMs (in all or specific repos) with XPath queries. I used it yesterday, trying to find out which POMs were using some specific plugin. Normally, I just run a textual search on “pom.xml” files through the whole “trunk” and it takes .. well, quite a while, of course. With Artifactory, it can be done smarter and faster:

XML Search

As you see, "/project/build/plugins/plugin/artifactId" search does the job.
And, of course, it takes less time than an old-school Total Commander textual search
(I don’t even need to measure it – it’s seconds vs minutes!)
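
For readers less used to XPath, here is a sketch of the POM fragment that such a query walks through. The plugin shown is only an example; plugins declared under <pluginManagement> or inside a <profile> sit on a different path and would need their own query.

```xml
<project>                                   <!-- /project -->
    <build>                                 <!-- /project/build -->
        <plugins>                           <!-- /project/build/plugins -->
            <plugin>                        <!-- /project/build/plugins/plugin -->
                <groupId>org.apache.maven.plugins</groupId>
                <!-- /project/build/plugins/plugin/artifactId: the element the search looks at -->
                <artifactId>maven-compiler-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
```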

So far I haven’t encountered a case where Artifactory searches were not sufficient.
They’re always smart (though I would call it "capable") enough.

 

Watches

I suppose this one is the easiest to describe and, in fact, there’s probably no need to describe it at all. One can set up “watches” to be notified by e-mail when a certain repository, folder or artifact has a “create” or “delete” operation performed on it:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following events have recently occurred in Artifactory on items you are watching:

Sun Jan 10 06:30:21 IST 2010 [user-name/XXX.XXX.XXX.XXX] [CREATED] libs-snapshots-local:com/clearforest/ProductsPage/8.0-SNAPSHOT/ProductsPage-8.0-SNAPSHOT.pom
Sun Jan 10 06:30:18 IST 2010 [user-name/XXX.XXX.XXX.XXX] [CREATED] libs-snapshots-local:com/clearforest/ProductsPage/8.0-SNAPSHOT/ProductsPage-8.0-SNAPSHOT.war
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Ironically, about a year ago I was practically dying to get this kind of notification: somehow, trunk POMs were overwritten by older versions and I suspected someone from the dev team was running “mvn deploy” on outdated sources (but this was not the case, actually).

So although I get quite a lot of e-mails from my “watches”, I just keep them in case anything like that ever happens again. And of course, it becomes even more necessary when certain repos are used for special purposes, like in a staging and promotion scenario.

 

Conclusions

I think “Power Pack” is a critical add-on to what Artifactory offers.

Being able to integrate it with Hudson, set custom properties and search for them is what makes it a nice, organized and watched storage rather than a kitchen sink of everything that happened to be downloaded from somewhere on the Internet.

Whether you have it or don’t – Artifactory surely delivers! But the questions are:

  • How aware are you of what’s happening?
  • How easy is it for you to dig through the mess and find what you need?

Maybe it’s just me, but I just love knowing what’s going on and when. It keeps me in control of things and not the other way around, which I believe is a good thing, in general.

Happy building!