[DI] Email and PDF as a source

Does anybody know of a way to use BODI to load PDFs or emails as a source?

I think in the case of an email it would be best to use a Perl program to parse the text and pull the data out into an appropriate flat-file format.
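To make the parse-to-flat-file idea concrete, here is a minimal sketch in Python rather than Perl, using only the standard library. The sample message, the key=value body layout, and the column names are all assumptions for illustration; a real mailbox will need its own parsing rules.

```python
import csv
import io
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample message; real input would come from a mailbox export.
RAW_MESSAGE = """\
From: alice@example.com
To: loads@example.com
Subject: Daily totals
Date: Mon, 14 Feb 2005 09:30:00 +0000

orders=42
revenue=1234.50
"""


def message_to_row(raw):
    """Turn one RFC 822 message into a flat dict of header and body fields."""
    msg = message_from_string(raw)
    row = {
        "sender": msg["From"],
        "subject": msg["Subject"],
        "sent_at": parsedate_to_datetime(msg["Date"]).isoformat(),
    }
    # Assumed body layout: one key=value pair per line.
    for line in msg.get_payload().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            row[key.strip()] = value.strip()
    return row


def rows_to_csv(rows, fieldnames):
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    columns = ["sender", "subject", "sent_at", "orders", "revenue"]
    print(rows_to_csv([message_to_row(RAW_MESSAGE)], columns))
```

The resulting CSV could then be described to DI as an ordinary flat-file source.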

Not sure about the PDFs.

Thanks


BODIpop (BOB member since 2005-02-14)

Are these emails stored in an email system (accessible through POP/IMAP) or in flat text files?

Can you give us some more context? What would you have in PDFs that would need to be brought into a database? Can whatever system generates the PDFs output the data in a different format?

I’ve often found that these integration problems require thinking “outside of the box”…


dnewton :us: (BOB member since 2004-01-30)

The emails are stored in an email system. I’m wondering if there’s anything that DI can do to get them out. Otherwise I’d have to get at them some other way and use something like Perl to format them so that DI can import them. What I’d like, and probably a few others would too, is to be able to create a datasource of type EMAIL or PDF.

I doubt it exists, but if anyone has handled anything like this before it would be nice to hear about it.

As far as PDFs go, let’s assume it’s just a normal PDF with some numbers in it that I’d like to get hold of.

Thanks


BODIpop (BOB member since 2005-02-14)

Hello,

This is quite an old thread, but I think the question is still relevant.

Is there an Adapter Datastore that provides DI access to an IMAP mailbox?
Or is there another way to access emails via DI (without having to save them as text files first)?
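For comparison, this is roughly what reading an IMAP mailbox looks like outside DI, as a sketch using Python's standard imaplib module. It is not a DI feature; it would run as a separate step ahead of the DI job. The function names, the host, and the credentials are placeholders I have made up.

```python
import imaplib
from email import message_from_bytes


def connect(host, user, password):
    """Open an SSL IMAP connection and log in (host and credentials are placeholders)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def fetch_messages(conn, mailbox="INBOX", criteria="ALL"):
    """Return parsed messages from an already logged-in imaplib connection."""
    # Open read-only so the messages are not flagged as seen.
    conn.select(mailbox, readonly=True)
    status, data = conn.search(None, criteria)
    if status != "OK":
        raise RuntimeError(f"IMAP search failed: {status}")
    messages = []
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue
        # parts[0] is a (response header, raw message bytes) tuple.
        messages.append(message_from_bytes(parts[0][1]))
    return messages
```

Usage would be along the lines of `fetch_messages(connect("imap.example.com", "user", "secret"))`, after which each message's headers and body can be written to whatever format DI is set up to read.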

Best regards,
Ruben


rsa :belgium: (BOB member since 2008-07-11)

Not by default. We’ve done it by using the SDK to develop an adapter that connects to MS Exchange and retrieves the attachments from emails.


rollinsa (BOB member since 2006-07-11)