project-cdsware-users@cern.ch archives



Re: Howto link fulltexts to bibliographic notices


  • From: Tibor Simko <tibor.simko@xxxxxxx>
  • Subject: Re: Howto link fulltexts to bibliographic notices
  • Date: Mon, 08 Mar 2004 11:40:08 +0100

On Wed, 18 Feb 2004, Gil Bourgeois wrote:
> I'm wondering how to link one/several pdfs or any other 'fulltexts'
> to bibliographic notices.  For a project at my university I am
> investigating how to put about 200 PDF files with metadata online.
>
> Ideally I would like that the PDFs are put in the right tree
> structure as if they were submitted via websubmit.

This technique is now possible with the release of CDSware v0.3.0.
Alternatively, if the documents are public, you can just store them
under any long-term public URI, and link them to your metadata via 856
tag, exactly as you describe.

> So one solution could be (maybe ?) :
>
> 1) Create the xml file with the available metadata (should I use
> Marc Tag '8564_' to link a fulltext to a metadata and how to do if
> there is more than one fulltext ?)

If there is more than one URI, just repeat the 856 tag once per URI.
For example, to add URIs to record #12345, prepare a file
``foo.xml'' containing:

  <record>
    <controlfield tag="001">12345</controlfield>
    <datafield tag="856" ind1="4" ind2="">
      <subfield code="u">http://foo.com/bar1.pdf</subfield>
      <subfield code="z">Fulltext (Part I)</subfield>
    </datafield>
    <datafield tag="856" ind1="4" ind2="">
      <subfield code="u">http://foo.com/bar2.pdf</subfield>
      <subfield code="z">Fulltext (Part II)</subfield>
    </datafield>
    <datafield tag="856" ind1="4" ind2="">
      <subfield code="u">http://baz.com/quux/</subfield>
      <subfield code="z">Group Website</subfield>
    </datafield>
  </record>

and then upload it like this:

  $ bibupload -c foo.xml
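
With some 200 PDFs, writing this XML by hand gets tedious.  Here is a
minimal sketch in Python that generates the same structure from a list
of (URL, label) pairs.  The function names ``build_record'' and
``record_to_xml'' are illustrative only and are not part of CDSware.
The enclosing <record> element follows the usual MARCXML convention;
that your bibupload version expects exactly this wrapper is an
assumption you should check.

```python
import xml.etree.ElementTree as ET


def build_record(rec_id, links):
    """Build a <record> element with one 856 datafield per (url, label) pair.

    Hypothetical helper for illustration; not a CDSware function.
    """
    record = ET.Element("record")
    # Controlfield 001 carries the ID of the record to be updated.
    control = ET.SubElement(record, "controlfield", {"tag": "001"})
    control.text = str(rec_id)
    for url, label in links:
        # Repeat the 856 tag once per URI.
        field = ET.SubElement(
            record, "datafield", {"tag": "856", "ind1": "4", "ind2": ""}
        )
        ET.SubElement(field, "subfield", {"code": "u"}).text = url
        ET.SubElement(field, "subfield", {"code": "z"}).text = label
    return record


def record_to_xml(record):
    """Serialize the record element to an XML string."""
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    links = [
        ("http://foo.com/bar1.pdf", "Fulltext (Part I)"),
        ("http://foo.com/bar2.pdf", "Fulltext (Part II)"),
        ("http://baz.com/quux/", "Group Website"),
    ]
    print(record_to_xml(build_record(12345, links)))
```

Redirect the output to ``foo.xml'' and upload it with bibupload as
shown above.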

> 2) Create an archive, consisting of directories and textfiles to
> represent collection of pdfs (fulltexts) and previously created xml
> file.

With CDSware v0.3.0 the fulltext files would go into directories like
``<prefix>/var/data/files/g0/<docID>/'', and the tables
``bibrec_bibdoc'', ``bibdoc'', and ``bibdoc_bibdoc'' would link the
files (docIDs) to the metadata (recIDs).  You could also try the
WebSubmit web interface and its ``Submit a Revised Version''
functionality to attach fulltext files to your existing records.
Thomas can give you more hints if you want to proceed this way.

> 3) Run bibconvert/bibupload that should dispatch the data into the
> right table and put the pdfs (fulltexts) into the right tree
> structure and also rename the files according to the websubmit
> rules.

Yes.  The BibConvert step should not be necessary, since you can
prepare your 856 tags in the XML MARC format suitable for direct
uploading, as described above.

> 4) Run bibwords, bibreformat and webcoll.

Yes.  Note that if you use approach #2 rather than approach #1,
fulltext indexing and format display will not work out of the box
yet, as BibIndex and BibFormat still look only at 856 tags when
locating the fulltext files.  The bibdoc tables will be taken into
consideration (in addition to 856) soon.