And you know what they call a… Quarter Pounder with Cheese in Paris?

06 Nov 08 My presentation @ ApacheCon US 2008

I am in New Orleans, attending a good edition of ApacheCon US. Today I feel much more relaxed, having given my talk yesterday afternoon.

I have published my presentation on SlideShare; if you are interested in Apache Tika, please have a look!

Content analysis for ECM with Apache Tika



24 May 08 Your ApacheCon US 2008 session proposal


Well, the title of this post is copied and pasted from one of the latest emails I received.

Good news today from ApacheCon Schedule: I’ll be a speaker at the upcoming ApacheCon US 2008 in New Orleans.

Here is a preview of what my session will be about. Please feel free to send me any suggestions, or leave them in a comment here! I would really like to build the session presentation through a community-like process; I would like to be the spokesman of an interested community!
Do you like the idea? Here is the session abstract.

Content analysis for ECM with Apache Tika

Apache Tika is an extensible content analysis toolkit designed for detecting and extracting metadata and structured text content from a large number of document formats. It provides a higher-level layer over existing parser libraries.
World-class content management systems, and above all those focused on enterprise document management, always face the challenge of detecting, extracting and indexing as many different media content types as possible.
This session provides a technical presentation on how you can integrate Tika inside an Enterprise Document Management System, in order to centralize media type detection implementation and leverage dedicated parsers behind a common extraction layer.
Participants will also learn how Tika is integrated with Lucene in order to provide high performance document indexing and searching features.
As a real life demonstration, documents in the most recently supported media types, Office Open XML (OOXML), will be detected, extracted and indexed.
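
To make the idea of a common extraction layer a bit more concrete, here is a minimal sketch in Python. It is not Tika's API (Tika is a Java library) and it is not the code from the session: the names `detect`, `extract`, `PARSERS` and `make_docx` are hypothetical, and the detection rules are simplified assumptions. It only illustrates the pattern described in the abstract: one central place that detects the media type, and dedicated parsers plugged in behind a single entry point, with a tiny OOXML-like package as the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the body text of a .docx file
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def detect(data: bytes) -> str:
    """Guess a media type from leading bytes (simplified, illustrative rules)."""
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    if data.startswith(b"PK\x03\x04"):  # ZIP local file header
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as z:
                names = z.namelist()
        except zipfile.BadZipFile:
            return "application/octet-stream"
        # OOXML files are ZIP packages; a Word document carries these entries
        if "[Content_Types].xml" in names and "word/document.xml" in names:
            return DOCX
        return "application/zip"
    try:
        data.decode("utf-8")
        return "text/plain"
    except UnicodeDecodeError:
        return "application/octet-stream"


def parse_docx(data: bytes) -> str:
    """Collect the text runs of each paragraph in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W_NS + "p"):
        paragraphs.append("".join(t.text or "" for t in p.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def parse_text(data: bytes) -> str:
    return data.decode("utf-8")


# Registry of dedicated parsers, keyed by media type (hypothetical design)
PARSERS = {DOCX: parse_docx, "text/plain": parse_text}


def extract(data: bytes):
    """Single entry point: detect the type, then delegate to its parser."""
    media_type = detect(data)
    parser = PARSERS.get(media_type)
    if parser is None:
        raise ValueError(f"no parser registered for {media_type}")
    return media_type, parser(data)


def make_docx(text: str) -> bytes:
    """Build a tiny in-memory package for the demo (not a complete .docx;
    the text is inserted without XML escaping)."""
    body = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main">'
        f"<w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("[Content_Types].xml", "<Types/>")
        z.writestr("word/document.xml", body)
    return buf.getvalue()


if __name__ == "__main__":
    print(extract(make_docx("Hello Tika")))
```

The caller only ever uses `extract`; supporting a new format means adding one rule to `detect` and one entry to `PARSERS`. The text returned by `extract` is what would then be handed to an indexer such as Lucene.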
