Biblio-Thesaurus-0.43/000755 000765 000024 00000000000 11764424700 014713 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/bin/000755 000765 000024 00000000000 11764424700 015463 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/Changes000644 000765 000024 00000012566 11764424630 016222 0ustar00ambsstaff000000 000000 Revision history for Perl extension Biblio::Thesaurus. 0.43 Fri Jun 8 17:34:05 WEST 2012 - r11219 - Fixed pod encoding issues. - Moved some scripts to examples folder. 0.42 Sat Jul 23 20:08:26 WEST 2011 - Removed stupid script from the module. 0.41 Sat Jul 23 19:32:45 WEST 2011 - Nothing relevant. Makefile.PL tweak. 0.40 Sat Feb 26 18:14:13 WET 2011 - Yet another missing file in MANIFEST 0.39 Fri Feb 25 21:17:41 WET 2011 - Added missing file to MANIFEST 0.38 Fri Feb 25 17:59:59 2011 - Fixed multiload (I hope). Added respective test. 0.37 Tue Nov 09 21:56:00 2010 - Added toHash method to export tree of relations; - Some more // for down-translation method that was issuing many warnings on undefined values. 0.36 Sun Sep 12 11:25:00 2010 - Require 5.10 (just one //=, but more later, I expect); 0.35 Fri Sep 10 19:33:00 2010 - Fixed encoding support. 0.34 Fri Sep 10 15:22:00 2010 - added method to export thesaurus in Json format 0.33 Fri Oct 03 17:07:00 2008 - Fixed bug in function terms; 0.32 Fri Sep 05 14:43:00 2008 - Added parser robustness; - Fixed Thesaurus2TeX; 0.31 Mon Aug 04 21:10:00 2008 - Stupid Mac extended attributes. Now fixed. 0.30 Mon Aug 04 18:01:02 2008 - added 'deleteRelation' method; - added extra functionality to 'hasRelation' method; 0.29 Thu Jul 10 18:20:00 2008 - removed all references to $', $` and $& (opefully a speed boost); - corrected a bug referring to the baselang without being in uppercase; - default baselang changed from ? to _; - support for multiple external relations with same relation identifier - fix transitive-closure not to enter endless loops; - fix thesaurusTranslate to treat correctly external relations; - added 'hasRelation' method; 0.28 Wed May 30 14:43:00 2007 - Added %enc encoding command; - Bug fixes; 0.27 Wed Jul 20 21:13:00 2005 - Forgot to add thesaurus2TMX to the MANIFEST file. Sorry 0.26 Wed Jul 20 8:26:00 2005 - Added XML::TMX optional dependency to install thesaurus2TMX script 0.25 Sat Apr 2 16:12:00 2005 - Added some utility string to be installed; - Support for multiline terms; - Multi-line continuations are handled better; 0.24 Thu Mar 24 21:48:00 2005 - Added some information missing on the POD; - New README file; 0.23 Wed Nov 10 10:44:00 2004 - Corrected bug dumping ISO thesaurus with external relationships; 0.22 Sat Nov 06 16:54:00 2004 - Corrected a test to run under Windows 0.21 Mon Oct 25 21:42:00 2004 - Moved Thesaurus.pm to lib/Biblio/Thesaurus.pm - Added some forgoten examples; 0.20 Mon Oct 25 21:29:00 2004 - Added tests for thesaurus union; - Corrected bug on thesaurus union; - Corrected bug on C<> method; 0.19 Wed Sep 29 17:25:00 2004 - Added pod and pod-coverage tests; - Improved documentation; - Removed some redundant code; 0.18 Wed Sep 29 16:25:00 2004 - Fixed problem with new 'append' function on CGI 0.17 Fri Oct 31 10:11:12 2003 - Bug fixes 0.16 Somewhere in 2003 - Use references 0.15 Fri Sep 15 02:03:04 2002 - Renamed to Biblio-Thesaurus - More options to navigate method (level1hide) - Ignore repeated terms; - Removed Library::MLang dependency; 0.14 Sat May 31 08:09:10 2002 - Added more options to navigate method (level2hide, level2size, scriptname) - Removed some warnings sent to STDOUT; 0.13 Thu Dec 19 02:03:04 2001 - Added the no title and the expand options to the navigate method; 0.12 Wed Dec 01 02:03:04 2001 - More tests - Some bug fixes 0.11 Fri Nov 02 08:32:99 2001 - Bug fixes - Added some error handling when loading ISOs - Type case maintained - More structure for multi-lingue thesaurus - Added some more tests 0.10 Sat Sep 08 21:59:59 2001 - extern class definition; - transitive closure function; - depthfirst function; - navigate function with custom expansion; - auto-completion after ISO load; - language class definition; - navigation method restructured for language usage; - navigation method does not require script name, anymore; - description command support multi-language; - renamed languages and setLanguages commands to interfaceLanguages and interfaceSetLanguages; - corrected bug on autocomplete function; - top command added 0.09 Fri Jan 19 20:01:10 2001 - Methods renamed without the thesaurus preffix; - thesaurusStore function renamed to storeOn; - thesaurusDump function renamed to save; - Function getHTMLTop; - i18n documentation; - dt and full_dt functions (see pod...). 0.08 Sun Jan 7 16:49:19 2001 - Multi-language support. 0.07 Fri Jan 5 11:10:09 2001 - Documentation in English. 0.06 Thu Jan 4 20:46:13 2001 - Renamed to be included under Library modules. 0.05 Fri Dec 29 17:58:00 2000 - Corrected HTML generated code; - Better documentation; - Bug-fixes. 0.04 Wed Dec 27 17:12:28 2000 - Functions: > thesaurusEdit; > thesaurusNavigate. - bug-fixes. 0.03 Wed Dec 27 10:00:00 2000 - Functions: > thesaurusAddTerm; > thesaurusAddRelations; > thesaurusDeleteTerm. 0.02 Tue Dec 26 15:36:09 2000 - Functions: > thesaurusLoad; > thesaurusComplete; > thesaurusGetHTMLTerm; > thesaurusDumpHTML; > thesaurusAddInverse; > thesaurusDump. 0.01 Tue Nov 28 15:51:04 2000 - original version; created by h2xs 1.19. Biblio-Thesaurus-0.43/docs/000755 000765 000024 00000000000 11764424700 015643 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/examples/000755 000765 000024 00000000000 11764424700 016531 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/INSTALL000644 000765 000024 00000001237 10730327653 015750 0ustar00ambsstaff000000 000000 =pod ----------------------------------------------------------------------------- Biblio::Thesaurus ----------------------------------------------------------------------------- To read this document in a more readable form, do perldoc INSTALL or convert it to HTML: pod2html INSTALL > file.html =head1 INSTALLATION Installation instructions: =head1 Needed Modules To use this module you need: =over 4 =item * Storable 1.0006 or newer (look at your nearest CPAN) =back =head1 Process After the modules installation, do: perl Makefile.PL make make test Now, as root, make install =cut Biblio-Thesaurus-0.43/lib/000755 000765 000024 00000000000 11764424700 015461 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/Makefile.PL000644 000765 000024 00000002044 11764424512 016666 0ustar00ambsstaff000000 000000 use ExtUtils::MakeMaker; use 5.010; my @scripts = ('bin/thesaurus2any', 'bin/thesaurus2htmls', 'bin/tag2thesaurus', 'bin/tax2thesaurus', 'bin/thesaurusTranslate', 'bin/thesaurus2tex'); my %reqmodules = ('Test::More' => 0, 'CGI' => 0, 'Storable' => '1.0006'); ## Test for XML::TMX my $V = eval { require XML::TMX; $XML::TMX::VERSION }; if ($@) { print "It seems you do not have XML::TMX. Not installing thesaurus2TMX script.\n"; } else { if ($V >= 0.12) { push @scripts, 'bin/thesaurus2TMX'; $reqmodules{'XML::TMX'} = '0.12'; print "XML::TMX found. Installing thesaurus2TMX script\n"; } else { print "It seems you do not have XML::TMX >= 0.12. Not installing thesaurus2TMX script.\n"; } } WriteMakefile( 'NAME' => 'Biblio::Thesaurus', 'VERSION_FROM' => 'lib/Biblio/Thesaurus.pm', 'ABSTRACT_FROM' => 'lib/Biblio/Thesaurus.pm', 'EXE_FILES' => [@scripts], 'PREREQ_PM' => {%reqmodules}, ); Biblio-Thesaurus-0.43/MANIFEST000644 000765 000024 00000001762 11764424700 016052 0ustar00ambsstaff000000 000000 Changes examples/animal.the examples/ex1.pl examples/ex2.pl examples/ex3.pl examples/ex4.pl examples/ex5.pl examples/example.pl examples/secondorder.the examples/thesaurus.english examples/thesaurus.portuguese bin/thesaurus2any bin/thesaurus2htmls examples/biling2thesaurus bin/tag2thesaurus examples/thesaurus2biling examples/thesaurusSC examples/thesaurusV bin/thesaurusTranslate bin/thesaurus2TMX bin/thesaurus2tex # bin/thesaurusBrowser INSTALL lib/Biblio/Thesaurus.pm Makefile.PL MANIFEST META.yml Module meta-data (added by MakeMaker) README t/00_simple.t t/01_construct.t t/10_join.t t/90_big.t t/a.the t/b.the t/dup.the t/pod-coverage.t t/pod.t t/02_empty.t t/02_empty.the t/03_toHash.t examples/animals2.iso examples/animals1.iso t/11_multiload.t examples/thesaurus2graphviz bin/tax2thesaurus bin/codetax2thesaurus bin/tageditor2thesaurus bin/thesaurus2tageditor docs/internals.pod examples/tabterm examples/ex.tax META.json Module JSON meta-data (added by MakeMaker) Biblio-Thesaurus-0.43/META.json000644 000765 000024 00000001702 11764424700 016334 0ustar00ambsstaff000000 000000 { "abstract" : "Perl extension for managing ISO thesaurus", "author" : [ "unknown" ], "dynamic_config" : 1, "generated_by" : "ExtUtils::MakeMaker version 6.62, CPAN::Meta::Converter version 2.120921", "license" : [ "unknown" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : "2" }, "name" : "Biblio-Thesaurus", "no_index" : { "directory" : [ "t", "inc" ] }, "prereqs" : { "build" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "runtime" : { "requires" : { "CGI" : "0", "Storable" : "1.0006", "Test::More" : "0", "XML::TMX" : "0.12" } } }, "release_status" : "stable", "version" : "0.43" } Biblio-Thesaurus-0.43/META.yml000644 000765 000024 00000001016 11764424700 016162 0ustar00ambsstaff000000 000000 --- abstract: 'Perl extension for managing ISO thesaurus' author: - unknown build_requires: ExtUtils::MakeMaker: 0 configure_requires: ExtUtils::MakeMaker: 0 dynamic_config: 1 generated_by: 'ExtUtils::MakeMaker version 6.62, CPAN::Meta::Converter version 2.120921' license: unknown meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: 1.4 name: Biblio-Thesaurus no_index: directory: - t - inc requires: CGI: 0 Storable: 1.0006 Test::More: 0 XML::TMX: 0.12 version: 0.43 Biblio-Thesaurus-0.43/README000644 000765 000024 00000001307 10730327653 015575 0ustar00ambsstaff000000 000000 Biblio::Thesaurus version 0.24 ================================ Perl extension for managing ISO thesaurus ABSTRACT This module provides transparent methods to maintain Thesaurus files. The module uses a subset from ISO 2788 which defines some standard features to be found on thesaurus files. The module also supports multilingual thesaurus and some extensions to the ISOs standard. INSTALLATION To install this module, run the following commands: perl Makefile.PL make make test make install COPYRIGHT AND LICENSE Copyright 2000-2005 Natura Project, All Rights Reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Biblio-Thesaurus-0.43/t/000755 000765 000024 00000000000 11764424700 015156 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/t/00_simple.t000644 000765 000024 00000002714 11060233440 017123 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Data::Dumper; use strict; use Test::More tests => 11; # Check module loadability BEGIN { use_ok("Biblio::Thesaurus"); } # Check 'transitive closure' method my $thesaurus = thesaurusLoad('examples/thesaurus.portuguese'); my @ft= $thesaurus->tc("local","NT"); my $count = scalar(@ft); is(10,$count); # Check depth_first my $ft= $thesaurus->depth_first("local" , 2 ,"NT","INST"); my $k=keys(%$ft); is($k,10-1); # Check miscelaneous ok(defined($thesaurus->{baselang})); ok(defined($thesaurus->{languages}{$thesaurus->{baselang}})); # Check multi-lang support $thesaurus = thesaurusLoad("examples/animal.the"); is($thesaurus->{EN}{cat},"gato"); is($thesaurus->getdefinition("cat"),"gato"); # Check definition type comparison ok($thesaurus->isDefined('GaTo')); $thesaurus = thesaurusLoad('examples/thesaurus.portuguese'); # tests number 8 and 9 my @defineds = keys %{$thesaurus->{$thesaurus->{baselang}}}; my $true = 1; my $true2 = 1; my $term; while($term = shift @defineds) { $true = 0 unless $thesaurus->isDefined($term); $true2 = 0 unless $thesaurus->isDefined($thesaurus->_definition($term)); } ok($true); ok($true2); # Test dups on dup.the $thesaurus = thesaurusLoad('t/dup.the'); my @terms = $thesaurus->terms("a", "NT"); ok(! array_with_dups(\@terms)); sub array_with_dups { my ($a1) = @_; my (%h1); @h1{@$a1}=@$a1; for (@$a1) { if (exists $h1{$_}) { delete $h1{$_}; } else { return 1; # DUP } } return 0 } Biblio-Thesaurus-0.43/t/01_construct.t000644 000765 000024 00000004164 11062725454 017675 0ustar00ambsstaff000000 000000 # -*- cperl -*- use strict; use Test::More tests => 31; BEGIN { use_ok("Biblio::Thesaurus") } # Thesaurus is an object of Biblio::Thesaurus type my $the = thesaurusNew(); isa_ok($the, "Biblio::Thesaurus"); my @allterms; # Empty thesaurus is really empty @allterms = $the->allTerms; is_deeply([@allterms], []); ok(!$the->isDefined("foo")); # Addiction really adds... $the->addTerm("foo"); ok($the->isDefined("foo")); ok(!$the->isDefined("bar")); # deletion works $the->addTerm("bar"); $the->deleteTerm("foo"); ok(!$the->isDefined("foo")); ok($the->isDefined("bar")); # term listing works @allterms = $the->allTerms; is_deeply([@allterms], [qw/bar/]); # term listing gives all terms $the->addTerm("foo"); @allterms = $the->allTerms; is_deeply([sort @allterms], [qw/bar foo/]); $the->addRelation("foo", "BT", "ugh"); @allterms = $the->allTerms; is_deeply([sort @allterms], [qw/bar foo ugh/]); $the->addRelation("foo", "BT", qw/zbr1 zbr2 zbr3 zbr4/); @allterms = $the->allTerms; is_deeply([sort @allterms], [qw/bar foo ugh zbr1 zbr2 zbr3 zbr4/]); ok($the->hasRelation("foo", "BT", "zbr1")); ok(!$the->hasRelation("foo", "XX", "zbr1")); ok(!$the->hasRelation("foo", "BT", "zbr5")); ok($the->hasRelation("foo","BT","ugh")); $the->complete; ok($the->hasRelation("ugh","NT","foo")); $the->deleteRelation("foo","BT","ugh"); ok(!$the->hasRelation("foo","BT","ugh")); ok(!$the->hasRelation("ugh","NT","foo")); $the->addRelation("bar","SN","Uma scope note qualquer"); ok($the->hasRelation("bar","SN")); $the->deleteRelation("bar","SN"); ok(!$the->hasRelation("bar","SN")); $the->addRelation("bar","BT", "AA", "BB"); $the->complete; ok($the->hasRelation("AA","NT","bar")); ok($the->hasRelation("BB","NT","bar")); my @rels = $the->relations("bar"); ok(!grep { $_ eq "_NAME_"} @rels); ok(grep {$_ eq "BT"} @rels); $the->deleteRelation("bar","BT"); ok(!$the->hasRelation("AA","NT","bar")); ok(!$the->hasRelation("BB","NT","bar")); ok(!$the->hasRelation("bar","BT","AA")); ok(!$the->hasRelation("bar","BT","BB")); $the->deleteRelation("bar","BT"); isa_ok($the, "Biblio::Thesaurus"); $the->deleteRelation("bar","BT","AA"); isa_ok($the, "Biblio::Thesaurus"); Biblio-Thesaurus-0.43/t/02_empty.t000644 000765 000024 00000000645 11131216347 017001 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Data::Dumper; use strict; use Test::More tests => 6; # Check module loadability BEGIN { use_ok("Biblio::Thesaurus"); } # Check 'transitive closure' method my $thesaurus = thesaurusLoad('t/02_empty.the'); ok($thesaurus); ok(!exists($thesaurus->{PT}{a}{RT})); ok(!exists($thesaurus->{PT}{c}{POF})); ok(exists($thesaurus->{PT}{d})); ok(exists($thesaurus->{PT}{ff})); #print STDERR Dumper $thesaurus; Biblio-Thesaurus-0.43/t/02_empty.the000644 000765 000024 00000000156 11131216641 017310 0ustar00ambsstaff000000 000000 %baselang PT a NT Aa, Ab, Ac RT BT c SN text and more text ff c NT a POF SN text, text and more text dBiblio-Thesaurus-0.43/t/03_toHash.t000644 000765 000024 00000000464 11466341366 017104 0ustar00ambsstaff000000 000000 #!perl use strict; use warnings; use Biblio::Thesaurus; use Data::Dumper; use Test::More tests => 4; my $the = thesaurusLoad('t/b.the'); my $hash = $the->toHash("NT"); is(ref($hash), "HASH", "Is a hash reference"); is($hash->{b}{Bb}, "b::Bb"); is($hash->{b}{Ba}, "b::Ba"); is($hash->{b}{Bc}, "b::Bc"); Biblio-Thesaurus-0.43/t/10_join.t000644 000765 000024 00000001122 10730327650 016574 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Data::Dumper; use strict; use Test::More tests => 13; # Check module loadability use Biblio::Thesaurus; my $loaded = 1; ok(1); my $a = thesaurusLoad("t/a.the"); my $c = $a->appendThesaurus("t/b.the"); # print STDERR Dumper($c); ok($c->isDefined("a")); ok($c->isDefined("b")); ok($c->isDefined("c")); ok($c->isDefined("Aa")); ok($c->isDefined("Ab")); ok($c->isDefined("Ac")); ok($c->isDefined("Ba")); ok($c->isDefined("Bb")); ok($c->isDefined("Bc")); my @terms = $c->terms("c", "NT"); ok(grep {$_ eq "a"} @terms); ok(grep {$_ eq "b"} @terms); is(scalar(@terms), 2); Biblio-Thesaurus-0.43/t/11_multiload.t000644 000765 000024 00000000356 11531754750 017645 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Data::Dumper; use strict; use Test::More tests => 2; # Check module loadability use Biblio::Thesaurus; my $loaded = 1; ok(1); my $obj = thesaurusMultiLoad("examples/animals1.iso", "examples/animals2.iso"); ok($obj); Biblio-Thesaurus-0.43/t/90_big.t000644 000765 000024 00000001113 10730327650 016406 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Data::Dumper; use strict; use Test; my $NT; my $INT; BEGIN { my $ntestes = 5; $NT = 100000; $INT = 1000; plan tests => ($NT/$INT)*2 + $ntestes } # Check module loadability use Biblio::Thesaurus; my $loaded = 1; ok(1); my $t = thesaurusNew(); my $i; for $i (0..$NT) { ok(1) if !($i%$INT); $t -> addTerm("termo[${i}]"); } $t->storeOn("_${$}_"); ok(1); undef($t); ### THis frees the memory?? my $the = thesaurusRetrieve("_${$}_"); ok(1); unlink("_${$}_"); for $i (0..$NT) { ok(1) if !($i%$INT); last unless $the -> isDefined("termo[${i}]"); } Biblio-Thesaurus-0.43/t/a.the000644 000765 000024 00000000101 10730327650 016066 0ustar00ambsstaff000000 000000 %baselang PT a NT Aa, Ab, Ac c NT a SN text, text and more textBiblio-Thesaurus-0.43/t/b.the000644 000765 000024 00000000037 10730327650 016077 0ustar00ambsstaff000000 000000 %top b b NT Ba, Bb, Bc c NT bBiblio-Thesaurus-0.43/t/dup.the000644 000765 000024 00000000105 10730327650 016442 0ustar00ambsstaff000000 000000 %baselang PT a NT Aa, Ab, Ac, Aa c NT a SN text, text and more textBiblio-Thesaurus-0.43/t/pod-coverage.t000644 000765 000024 00000000444 11764424263 017724 0ustar00ambsstaff000000 000000 use Test::More; eval "use Test::Pod::Coverage 0.08"; #plan skip_all => "still not passing this test."; plan skip_all => "Test::Pod::Coverage 0.08 required for testing POD coverage" if $@; plan skip_all => "export AUTHOR_TEST for author tests" unless $ENV{AUTHOR_TEST}; all_pod_coverage_ok(); Biblio-Thesaurus-0.43/t/pod.t000644 000765 000024 00000000322 11764424307 016125 0ustar00ambsstaff000000 000000 use Test::More; eval "use Test::Pod 1.00"; plan skip_all => "Test::Pod 1.00 required for testing POD" if $@; plan skip_all => "export AUTHOR_TEST for author tests" unless $ENV{AUTHOR_TEST}; all_pod_files_ok(); Biblio-Thesaurus-0.43/lib/Biblio/000755 000765 000024 00000000000 11764424700 016661 5ustar00ambsstaff000000 000000 Biblio-Thesaurus-0.43/lib/Biblio/Thesaurus.pm000644 000765 000024 00000155512 11764424117 021215 0ustar00ambsstaff000000 000000 # -*- Mode: Perl; tab-width: 2; -*- package Biblio::Thesaurus; use 5.010; use strict; use warnings; require Exporter; use Storable; use CGI qw/:standard/; use Data::Dumper; # Version our $VERSION = '0.43'; # Module Stuff our @ISA = qw(Exporter); our %EXPORT_TAGS = ( 'all' => [ qw() ] ); our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } ); # We are working with an object oriented interface. This means, we only # need to export constructors. # # The last three variables are used to down-translation sub (downtr) our @EXPORT = qw( &thesaurusLoad &thesaurusLoadM &thesaurusNew &thesaurusRetrieve &thesaurusMultiLoad @terms $term $rel); our ($casesen,$rel,@terms,$term); ## # # sub top_name { topName(@_) } sub topName { my ($self, $name) = @_; if($name){ $self->{name} = $name;} else { return $self->{name};} } sub order { my ($self,@names) = @_; if(@names){ $self->{order} = [@names] ; } else { defined $self->{order} ? (@{$self->{order}}) : () } } sub isLanguage{ my ($self,$l) = @_; return defined $self->{languages}{$l} } sub languages{ my ($self,@names) = @_; if(@names){ for (@names) { $self->{languages}{$_} = 1; }} else { keys (%{$self->{languages}}) } } sub baselang { my ($self,$name) = @_; if($name){ $self->{$name} = $self->{$self->{baselang}}; delete $self->{$self->{baselang}}; $self->{baselang} = $name;} else {return $self->{baselang};} } ## # # sub terms { my ($self, $term, @rels) = @_; my $base = $self->{baselang}; return () unless $self->isDefined($term); $term = $self->_definition($term); @rels = map { uc $_ } @rels; return (map { if (defined($self->{$base}{$term}{$_})) { if (ref($self->{$base}{$term}{$_}) eq "ARRAY") { @{$self->{$base}{$term}{$_}} } else { ($self->{$base}{$term}{$_}) } } else { () } } @rels); } ## # Parece-me que não está a ser usada. # # sub external { # my ($self,$term,$external) = @_; # $external = uc($external); # $term = $self->definition($term); # return $self->{$self->{baselang}}{$term}{$external}; # } ### # # sub all_terms { allTerms(@_) } sub allTerms { my $self = shift; return sort keys %{$self->{$self->{baselang}}}; } ### # # sub depth_first { my ($self,$term,$niveis,@relat) = @_; my %st=(); if ($niveis>=1) { for ($self->terms($term,@relat)) { $st{$_}=depth_first($self,$_,$niveis-1,@relat); } \%st; } elsif($niveis == 0) {1} else {1} } ### # # sub _default_norelations { return { 'URL'=> 1, 'SN' => 1 }; } ### # # sub _default_inversions { +{ NT => 'BT', BT => 'NT', RT => 'RT', USE => 'UF', UF => 'USE' }; } ### # # sub _translateTerm { my ($self,$term,$lang,$dic) = @_; $dic = {} unless $dic; $lang = uc($lang); # Se foi $lang definido como linguagem if (defined($self->{languages}{$lang})) { my $trad; # Se existe a tradução if (defined($trad = $self->{$self->{baselang}}{$term}{$lang})) { return $trad; } } if(defined $dic->{$term}) {return $dic->{ $term}} if(defined $dic->{lcfirst($term)}) {return ucfirst($dic->{lcfirst($term)})} if(defined $dic->{lc($term)}) {return uc($dic->{ lc($term)})} return "[$self->{baselang}-$lang:".$self->getdefinition($term)."]"; } ### # # sub appendThesaurus { my ($self,$other) = @_; # This way we handle full thesaurus objects or simple filename unless (ref($other)) { $other = thesaurusLoad($other); } my $new; # Check if baselang is the same, or if some of them is undefined if ($self->{baselang} eq $other->{baselang}) { $new->{baselang} = $self->{baselang} } elsif ($self->{baselang} eq "_") { $new->{baselang} = $other->{baselang} } elsif ($other->{baselang} eq "_") { $new->{baselang} = $self->{baselang} } else { return undef; } # If some of the top is _top_, the other is choosed. If # there are two different tops, use the first ($self) one if ($other->{name} eq $self->{name}) { $new->{name} = $self->{name} } elsif ($other->{name} eq "_top_") { $new->{name} = $self->{name} } elsif ($self->{name} eq "_top_") { $new->{name} = $other->{name} } else { $new->{name} = $self->{name} } # VERSION: current module version $new->{version} = $VERSION; sub _ffjoin { # key, hash1ref, hash2ref my ($c,$a,$b) = @_; if (exists($a->{$c}) && exists($b->{$c})) { return {%{$a->{$c}},%{$b->{$c}}}; } elsif (exists($a->{$c})) { return {%{$a->{$c}}} } elsif (exists($b->{$c})) { return {%{$b->{$c}}} } else { return {} } } # Inverses: join hash tables... in conflict, $self is used $new->{inverses} = _ffjoin("inverses",$other,$self); # Descriptions: in conflict, $self is used $new->{descriptions} = _ffjoin("descriptions",$other,$self); # Externals: union $new->{externals} = _ffjoin("externals",$self,$other); # Languages: union $new->{languages} = _ffjoin("languages",$self,$other); # delete($new->{languages}{"_"}) if ($new->{baselang} ne "_"); # Get terms for the new thesaurus my @terms = _set_of(keys %{$self ->{$self ->{baselang}}}, keys %{$other->{$other->{baselang}}}); # Para cada termo do thesaurus... for my $term (@terms) { # existe em ambos... if ($self->isDefined($term) && $other->isDefined($term)) { my ($a_def,$b_def) = ($self->_definition($term), $other->_definition($term)); my $def = $a_def; $new->{defined}{lc($def)} = $def; my @class = _set_of(keys %{$self ->{$self ->{baselang}}{$a_def}}, keys %{$other->{$other->{baselang}}{$b_def}}); # para cada uma das suas relações... for my $class (@class) { if ($class eq "_NAME_") { # print STDERR Dumper($new->{$new->{baselang}}{$def}); # optar pela forma do thesaurus A $new->{$new->{baselang}}{$def}{_NAME_} = $def; } elsif ($new->{externals}{$class}) { if (exists($self->{$self->{baselang}}{$a_def}{$class})) { push @{$new->{$new->{baselang}}{$def}{$class}}, @{$self->{$self->{baselang}}{$a_def}{$class}}; } if (exists($other->{$other->{baselang}}{$b_def}{$class})) { push @{$new->{$new->{baselang}}{$def}{$class}}, @{$other->{$other->{baselang}}{$b_def}{$class}}; } } elsif ($new->{languages}{$class}) { $new->{$new->{baselang}}{$def}{$class} = "_"; } else { if (exists($self ->{$self ->{baselang}}{$a_def}{$class}) && exists($other->{$other->{baselang}}{$b_def}{$class})) { # Join lists my %there; @there{@{$self->{$self->{baselang}}{$a_def}{$class}}} = 1 x @{$self->{$self->{baselang}}{$a_def}{$class}}; push @{$new->{$new->{baselang}}{$def}{$class}}, keys %there; for (@{$other->{$other->{baselang}}{$b_def}{$class}}) { unless ($there{$_}) { push @{$new->{$new->{baselang}}{$def}{$class}}, $_; } $there{$_} = 1; } } elsif (exists($self->{$self->{baselang}}{$a_def}{$class})) { $new->{$new->{baselang}}{$def}{$class} = $self->{$self->{baselang}}{$a_def}{$class}; } else { ## other->b_def->class $new->{$new->{baselang}}{$def}{$class} = $other->{$other->{baselang}}{$b_def}{$class}; } } } } elsif ($self->isDefined($term)) { $new->{defined}{lc($term)} = $self->_definition($term); $new->{$new->{baselang}}{$term} = $self->{$self->{baselang}}{$term}; } else { ### $other->isDefined($term) $new->{defined}{lc($term)} = $other->_definition($term); $new->{$new->{baselang}}{$term} = $other->{$other->{baselang}}{$term}; } } return bless($new); } ### # # sub thesaurusMultiLoad { my @files = @_; my $self = thesaurusLoad(shift @files); while(@files) { $self->appendThesaurus(shift @files); } return $self; } ### # # sub top { my $self = shift; my $script = shift; return ""; } ### # # sub _default_descriptions { return { 'RT' => q/Related term/, 'TT' => q/Top term/, 'NT' => q/Narrower term/, 'BT' => q/Broader term/, 'USE' => q/Synonym/, 'UF' => q/Quasi synonym/, 'SN' => q/Scope note/, }; } sub setExternal { my ($self,@rels) = @_; for (@rels) { $self->{externals}{uc($_)} = 1; } return $self; } sub isExternal { my ($self,$ext) = @_; return (defined($self->{externals}{uc($ext)}) && defined($self->{externals}{uc($ext)}) == 1); } ### # # sub thesaurusNew { my $obj = { # thesaurus => {}, inverses => _default_inversions(), descriptions => _default_descriptions(), externals => _default_norelations(), name => '_top_', baselang => '?', languages => {}, version => $VERSION, prefix => "", }; # bless and return it! Amen! return bless($obj); } ### # # sub storeOn { store(@_); } ### # # sub thesaurusRetrieve { my $file = shift; my $obj = retrieve($file); if (defined($obj->{version})) { return $obj; } else { die("Rebuild your thesaurus with a recent Biblio::Thesaurus version"); } } ### # # sub _trurl { my $t = shift; $t =~ s/\s/+/g; return $t; } ### # # sub getHTMLTop { my $self = shift; my $script = shift || $ENV{SCRIPT_NAME}; my $t = ""; return $t; } ### # # sub thesaurusLoad { my %opt =(); # completed => 1 if(ref($_[0]) eq "HASH") {%opt = (%opt , %{shift(@_)}) } ; my ($file,$self) = @_; my %thesaurus; unless($self){ $self->{inverses} = _default_inversions(); $self->{descriptions} = _default_descriptions(); $self->{externals} = _default_norelations(); $self->{name} = "_top_"; $self->{baselang} = "_"; $self->{languages} = {}; $self->{defined} = {}; $self->{version} = $VERSION; } else { $self->{defined} = {}; } # Open the thesaurus file to load open ISO, $file or die (q/Can't open thesaurus file/); ### binmode(ISO,"$opt{encoding}:") if($opt{encoding}); # While we have commands or comments or empty lines, continue... while(($_ = )=~/(^(%|#))|(^\s*$)/) { chomp; if (/^%\s*inv(?:erse)?\s+(\S+)\s+(\S+)/) { # Treat the inv*erse command $self->{inverses}{uc($1)} = uc($2); $self->{inverses}{uc($2)} = uc($1); } elsif (/^%\s*enc(oding)?\s+(\S+)/) { $self->{encoding} = lc $2; $self->{encoding} =~ s/_/-/g; binmode ISO, ":encoding($self->{encoding})"; } elsif (/^%\s*tit(le)?\s+(.+)/) { $self->{title} = $2; } elsif (/^%\s*aut(hor)?\s+(.+)/) { $self->{author} = $2; } elsif (/^%\s*desc(ription)?\[(\S+)\]\s+(\S+)\s+(.*)$/) { # Treat the desc*cription [lang] command.... 'RT EN' $self->{descriptions}{uc($3)." ".uc($2)} = $3; } elsif (/^%\s*desc(ription)?\s+(\S+)\s+(.*)$/) { # Treat the desc*cription command $self->{descriptions}{uc($2)} = $3; } elsif (/^%\s*ext(ernals?)?\s+(.*)$/) { # Treat the ext*ernals command chomp(my $classes = uc($2)); for (split /\s+/, $classes) { $self->{externals}{$_} = 1; } } elsif (/^%\s*lang(uages?)?\s+(.*)$/) { # Treat the lang*uages command chomp(my $classes = uc($2)); for (split /\s+/, $classes) { $self->{languages}{$_} = 1; } } elsif (/^%\s*top\s+(.*)$/) { $self->{name} = $1; } elsif (/^%\s*baselang(uage)?\s+(\S+)/) { $self->{baselang} = uc($2); } elsif (/^%/) { print STDERR "Unknown command: '$_'\n\n"; } else { # It's a comment or an empty line: do nothing } } # Redefine the record separator my $old_sep = $/; $/ = ""; # The last line wasn't a comment, a command or an empty line, so use it! $_ .= ; my $ncommands = $.-1; # While there are definitions... do { # define local variables my ($class,$term,$relations); ## Concat lines that continue back in one s/\n[ \t]+/ /g; # Can't use \s because "\n" =~ m!\s! # The first line contains the term to be defined /(.+)(?:\n((.|\n)+)|\n?$)/; $term = $1; $relations = $2 || ""; # If the term is all spaces, go back... if ($term =~ /^\s+$/) { print STDERR "Term with only spaces ignored at block term ",$.-$ncommands,"\n\n"; $term = '#zbr'; # This makes the next loop think this is a comment and ignore it } # Let's see if the term is commented... unless ($term =~ /^#/) { $term = _term_normalize($term); $term = $self->{defined}{lc($term)} if ($self->{defined}{lc($term)}); $thesaurus{$term}{_NAME_} = $term; $self->{defined}{lc($term)} = $term; # The remaining are relations $_ = $relations; # OK! The term is *not* commented... # For each definition line... $_.="\n" unless /\n$/; while (/((([^#\s]+)|#)[ \t]*(.*)\n)/g) { next unless $4; # Is it commented? unless ($2 eq "#") { # it seems not... set the relation class $class = uc($2); # || $class;... now multiline are handled before this print STDERR "** WARNING **: '$1'\n" unless $class; # See if $class has a description $self->{descriptions}{$class} = ucfirst(lc($class)) unless defined $self->{descriptions}{$class}; ## $descs->{$class}= ucfirst(lc($class)) unless(defined($descs->{$class})); # divide the relation terms by comma unless it is a language or extern relation if ( exists($self->{externals}{$class}) && defined($self->{externals}{$class}) ) { ## $thesaurus{$term}{$class}.= ($2?"$4":" $4"); ## $thesaurus{$term}{$class}.= ($thesaurus{$term}{$class}?" $4":"$4"); push @{$thesaurus{$term}{$class}}, $4; } elsif (exists($self->{languages}{$class}) && defined($self->{languages}{$class})) { # $translations->{$class}->{_term_normalize($4)}.=$term; $self->{$class}{$4}.=$term; $self->{defined}{_term_normalize(lc($4))} = $term; $thesaurus{$term}{$class} = $4; } else { push(@{$thesaurus{$term}{$class}}, map { _term_normalize($_) } split(/\s*,\s*/, $4)); } } } } } while(); # Close the ISO thesaurus file close ISO; # revert to the old record separator. Not needed, but beautifer. $/ = $old_sep; $self->{$self->{baselang}} = \%thesaurus; $self->{languages}{$self->{baselang}} = 1; # bless and return the thesaurus! Amen! if (exists($opt{completed}) && $opt{completed}) { return bless($self); } else { return complete(bless($self)); } } sub _lc{ if($casesen){$_[0]} else {lc($_[0])} } sub thesaurusLoadM { my $file = shift; my ($t,$rs)= _treatMetas1(thesaurusLoad($file)); if(@$rs){ undef $t->{$t->{baselang}}; undef $t->{defined}; _treatMetas2(thesaurusLoad($file,$t),$rs);} else{$t} } sub _treatMetas1 { my $t = shift; my @ts=(); my %r=(); if(@ts=$t->terms("_order_","NT")) { $t->order(@ts); @r{@ts,"_order_"}=(@ts,1) } if(@ts=$t->terms("_external_","NT")){ $t->setExternal(@ts); @r{@ts,"_external_"}=(@ts,1) } if(@ts=$t->terms("_top_","NT")) { $t->topName($ts[0]); $r{"_top_"}=1 } if(@ts=$t->terms("baselang_","NT")){ $t->baselang($ts[0]); @r{@ts,"baselang_"}=(@ts,1) } if(@ts=$t->terms("_language_","NT")){ $t->languages(@ts); @r{@ts,"_language_"}=(@ts,1) } if(@ts=$t->terms("_symmetric_","NT")){ for(@ts){ $t->addInverse($_,$_);} @r{@ts,"_symmetric_"}=(@ts,1) } # for each new relation describe it, add Invers and remove it as Term if(@ts=$t->terms("_relation_","NT")){ $r{"_relation_"}=1 ; $t->downtr( { SN => sub{ $t->describe({rel => $term, desc=>$terms[0]}) }, ## FALTA A LINGUA INV => sub{ $t->addInverse($term,$terms[0])}, RANG => sub{ $t->setExternal($term)}, -order => ["SN","INV"], -eachTerm => sub{ $r{$term}=$term }, }, @ts); } ($t,[(keys %r)]); } sub _treatMetas2{ my ($t,$rs)= @_; for (@$rs){ $t->deleteTerm($_)} $t; } ### # # sub getDescription { my ($obj, $rel, $lang) = @_; if (defined($lang)) { my $x = uc($rel)." ".uc($lang); return exists($obj->{descriptions}->{$x})?$obj->{descriptions}->{$x}:"..."; } else { my $x = uc($rel)." ".uc($obj->{baselang}); if (exists($obj->{descriptions}->{$x})) { return $obj->{descriptions}->{$x}; } elsif (exists($obj->{descriptions}->{$rel})) { return $obj->{descriptions}->{$rel}; } else { return "..."; } } } ### # # sub describe { my ($obj, $conf) = @_; my ($class, $desc, $lang); return unless ($class = uc($conf->{rel})); return unless ($desc = $conf->{desc}); if ($conf->{lang}) { $lang = " ".uc($conf->{lang}); } else { $lang = ""; } $obj->{descriptions}->{$class.$lang}=$desc; } ### # # sub addInverse { my ($obj,$a,$b) = @_; $a = uc($a); $b = uc($b); $obj->{descriptions}{$a}="..." unless(defined($obj->{descriptions}{$a})); $obj->{descriptions}{$b}="..." unless(defined($obj->{descriptions}{$b})); for (keys %{$obj->{inverses}}) { delete($obj->{inverses}{$_}) if (($obj->{inverses}{$_} eq $a) || ($obj->{inverses}{$_} eq $b)); } $obj->{inverses}{$a}=$b; $obj->{inverses}{$b}=$a; } ### # # sub meta2str { my $obj = shift; my $term; my %inverses = %{$obj->{inverses}}; my %descs = %{$obj->{descriptions}}; my $t = ""; # Save the 'encoding' command # $t.="\%encoding $obj->{encoding}\n\n" if defined $obj->{encoding} ; # Save the 'title' command # $t.="\%title $obj->{title}\n\n" if defined $obj->{title}; # Save the 'author' command # $t.="\%author $obj->{author}\n\n" if defined $obj->{author}; # Save the externals commands # $t.= "\%externals " . join(" ",keys %{$obj->{externals}}); $t.="\n\n"; # Save the languages commands # $t.= "\%languages " . join(" ",keys %{$obj->{languages}}); $t.="\n\n"; # Save the 'top' command # $t.="\%top $obj->{name}\n\n" if $obj->{name} ne "_top_"; # Save the 'baselanguage' command # $t.="\%baselanguage $obj->{baselang}\n\n" if $obj->{baselang} ne "_"; # Save the inverses commands # for $term (keys %inverses) { $t.= "\%inverse $term $inverses{$term}\n"; } $t.="\n\n"; # Save the descriptions commands # for $term (keys %descs) { if ( $term =~ /^(\w+)\s+(\w+)$/ ) { $t.= "\%description[$2] $1 $descs{$term}\n"; } else { $t.= "\%description $term $descs{$term}\n"; } } $t.="\n\n"; $t; } ## # # sub save { my $obj = shift; my $file = shift; my ($term,$class); my %thesaurus = %{$obj->{$obj->{baselang}}}; my $t = meta2str($obj); #save the metadata # Save the thesaurus # for $term (keys %thesaurus) { $t.= "\n$thesaurus{$term}{_NAME_}\n"; for $class ( keys %{$thesaurus{$term}} ) { next if $class eq "_NAME_"; if(defined $obj->{languages}{$class}) { $t.= "$class\t$thesaurus{$term}->{$class}\n"; } else { # if save_compact, juntar por ',' as relacoes nao external $t.= "$class\t$_\n" for (@{$thesaurus{$term}{$class}}); } } } open F, ">$file" or return 0; if (defined $obj->{encoding}) { $obj->{encoding} = lc($obj->{encoding}); $obj->{encoding} =~ s/_/-/g; binmode(F,":encoding($obj->{encoding})") ; } print F $t; close F; return 1; } ### # # sub navigate { # The first element is the object reference my $obj = shift; # This is the script name my $script = $ENV{SCRIPT_NAME} || ""; # Get the configuration hash my $conf = {}; if (ref($_[0])) { $conf = shift } my $expander = $conf->{expand} || []; my @tmp = map {$obj->{inverses}{$_}} @$expander; my $language = $conf->{lang} || undef; my $second_level_limit = $conf->{level2size} || 0; my $hide_on_first_level = $conf->{level1hide} || []; my $hide_on_second_level = $conf->{level2hide} || \@tmp; my $capitalize = $conf->{capitalize} || 0; my $topic = $conf->{topic_name} || "t"; my %hide; @hide{@$hide_on_first_level} = @$hide_on_first_level; $script = $conf->{scriptname} if (exists($conf->{scriptname})); my %param = @_; my $term; my $show_title = 1; if (exists($param{$topic})) { $param{$topic} =~ s/\+/ /g; $term = $obj->getdefinition($param{$topic}); } else { $show_title = 0 if exists($conf->{title}) && $conf->{title} eq "no"; if ($obj->isDefined($obj->{name})) { $term = $obj->{defined}{lc($obj->{name})}; } else { $term = '_top_'; } } my (@terms,$html); # If we don't have the term, return only the title return h2($term) unless ($obj->isDefined($term)); # Make the page title $html = h2(capitalize($capitalize, $obj->_translateTerm($term,$language))) if $show_title; # Get the external relations my %norel = %{$obj->{externals}}; # Now print the relations my $rel; for $rel (keys %{$obj->{$obj->{baselang}}{$term}}) { # next iteraction if the relation is the _NAME_ next if ($rel eq "_NAME_"); # Next if we want to hide it next if exists $hide{$rel}; # This block jumps if it is an expansion relation next if grep {$_ eq uc($rel)} @{$expander}; # The externs exceptions... if (exists($norel{$rel})) { # It's an external, so... # # Its description is "..."? my $desc = $obj->getDescription($rel, $language); $html .= join("
\n", map { b($desc)." $_" } @{$obj->{$obj->{baselang}}{$term}{$rel}}); $html .= " ".br; } elsif (exists($obj->{languages}{$rel})) { ## This empty block is used for languages translations } else { ## OK! It's a simple relation # There is a translation for the *relation* description? my $desc = $obj->getDescription($rel, $language); if ($desc eq "...") { $html .= b($rel)." "; } else { $html.= b($desc)." "; } # Now, write each term with a thesaurus link $html.= join(", ", map { my $term = $_; my $link = $term; $link =~ s/\s/+/g; $term = $obj->_translateTerm($term, $language); a({ href=>"$script?$topic=$link"},$term) } sort {lc($a)cmp lc($b)} @{$obj->{$obj->{baselang}}{$term}{$rel}}); $html.= br; } } # Now, treat the expansion relations for $rel (@{$expander}) { $rel = uc($rel); if (exists($obj->{$obj->{baselang}}{$term}{$rel})) { @terms = sort {lc($a)cmp lc($b)} @{$obj->{$obj->{baselang}}{$term}{$rel}}; $html.= ul(li([map { _thesaurusGetHTMLTerm($_, $obj, $script, $language, $second_level_limit, $hide_on_second_level); } @terms])) if (@terms); } } return $html; } ### # # sub toTex{ my $self = shift; my $_corres = shift || {}; my $mydt = shift || {}; # my $a; my %descs = %{$self->{descriptions}}; my $procgr= sub { my $r="";# my $a; my $auxrel = $descs{$rel} || $rel; $auxrel =~ s/_/ /g; $auxrel = ucfirst(lc($auxrel)); my $ki = $_corres->{$rel}->[0] || "\\\\\\emph{$auxrel} -- " ; my $kf = $_corres->{$rel}->[1] || "\n"; $r = "\\item[$ki]" . join(' $\diamondsuit$ ',(sort {lc($a) cmp lc($b)} @terms)) if @terms; }; $self->downtr( { '-default' => $procgr, '-end' => sub{s/_/\\_/g; "\\begin{description}\n$_\\end{description}\n"}, '-eachTerm' => sub{"\n\\item[$term]~\\begin{description}\n$_\\end{description}\n"}, (defined $self->{order}?(-order => $self->{order}):()), (%$mydt) } ); } sub toXml{ my $self = shift; my $_corres = shift || {}; my $mydt = shift || {}; my $a; my $proc= sub { my $r=""; my $a; my $ki = $_corres->{$rel}->[0] || "$rel" ; my $kf = $_corres->{$rel}->[1] || "/$rel"; for $a (@terms){ $r .= " <$ki>$a<$kf>\n";}; $r; }; $self->downtr({ '-default' => $proc, '-eachTerm' => sub{" \n <$self->{baselang}>$term{baselang}>\n$_ \n"}, '-end' => sub{"\n$_\n"}, (%$mydt) }); } ### # # sub dumpHTML { my $obj = shift; my %thesaurus = %{$obj->{$obj->{baselang}}}; my $t = ""; for (keys %thesaurus) { $t.=_thesaurusGetHTMLTerm($_,$obj,"",$obj->{baselang}); } return $t; } ### # # sub relations { my ($self,$term) = @_; return sort grep { $_ !~ /^_/ } keys %{$self->{$self->{baselang}}->{$term}} } ### # # Given a term, return it's information (second level for navigate) sub _thesaurusGetHTMLTerm { my ($term,$obj,$script,$language,$limit,$hide) = @_; my @rels2hide = map {uc} (defined($hide))?@$hide:(); my %rels2hide; @rels2hide{@rels2hide}=1; # Put thesaurus and descriptions on handy variables my %thesaurus = %{$obj->{$obj->{baselang}}}; my %descs = %{$obj->{descriptions}}; # Check if the term exists in the thesaurus if ($obj->isDefined($term)) { $term = $obj->{defined}{lc($term)}; my ($c,$t,$tterm); my $link = $term; $link =~ s/\s/+/g; $tterm = $obj->_translateTerm($term,$language); $t = b(a({href=>"$script?t=$link"},$tterm)). br . "
\n"; for $c (sort keys %{$thesaurus{$term}}) { $c = uc($c); next if exists($rels2hide{$c}); # jump if it is the name relation :) next if ($c eq "_NAME_"); if (exists($obj->{externals}{$c})) { # put an external relation my $desc = $obj->getDescription($c,$language); if ($desc eq "...") { $t .= join("
\n", map { div($_) } @{$thesaurus{$term}{$c}}); } else { $t .= join("
\n", map { b($desc)." $_" } @{$thesaurus{$term}{$c}}); } } elsif (exists($obj->{languages}{$c})) { # Jump the language relations } else { my $desc = $obj->getDescription($c,$language); if ($desc eq "...") { $t.= b($c)." "; } else { $t.= b($desc)." "; } my @termos = sort {lc($a)cmp lc($b)} ( @{$thesaurus{$term}{$c}} ); if (defined($limit) && $limit!=0 && @termos > $limit) { while(@termos > $limit) { pop @termos; } push @termos, "..."; } if (defined($script)) { @termos = map {my $link = $_; if ($link eq "...") { $link } else { $_ = $obj->_translateTerm($_,$language) || $_; $link =~s/\s/+/g; a({href=>"$script?t=$link"},$_) } } @termos; } $t.= join(", ", @termos) . br."\n"; } } $t.= "
\n"; return $t; } else { print STDERR "Can't find term '$term'\n"; return qq/Term $term is not defined\n/; } } sub getdefinition { getDefinition(@_) } sub getDefinition { my $self = shift; my $term = _term_normalize(lc(shift)); if ($self->isDefined($term)) { return $self->{defined}{$term}; } else { return $term; } } ### # # sub isDefined { my $obj = shift; my $term = _term_normalize(lc(shift)); return defined($obj->{defined}{$term}); } ### # # sub _definition { my ($self,$term) = @_; return $self->{defined}{_term_normalize(lc($term))}; } ### # # sub complete { my $obj = shift; my $thesaurus = $obj->{$obj->{baselang}}; my %inverses = %{$obj->{inverses}}; my ($termo,$classe); # para cada termo for $termo (keys %$thesaurus) { # $obj->{defined}{lc($termo)} = $termo; # e para cada classe, for $classe (keys %{$thesaurus->{$termo}}) { # verificar se existem duplicados... if (ref($thesaurus->{$termo}{$classe}) eq "ARRAY") { my %h; @h{@{$thesaurus->{$termo}{$classe}}} = @{$thesaurus->{$termo}{$classe}}; $thesaurus->{$termo}{$classe} = [ keys %h ]; # se tiver inverso, if (defined($inverses{$classe})) { # completar cada um dos termos relacionados for (@{$thesaurus->{$termo}{$classe}}) { # %thesaurus = _completa($obj,$_,$inverses{$classe},$termo,%thesaurus); _completa($obj,$_,$inverses{$classe},$termo,$thesaurus); } } } } } $obj -> {$obj->{baselang}} = $thesaurus; return $obj; } ### # # sub _completa { ## Yeah, obj and thesaurus can be redundanct, but it's better this way... my ($obj,$palavra,$classe,$termo,$thesaurus) = @_; my $t; # Ver se existe a palavra e a classe no thesaurus if ($obj->isDefined($palavra)) { $t = $obj->{defined}{lc($palavra)}; if (defined($thesaurus->{$t}{$classe})) { # se existe, o array palavras fica com os termos (para ver se ja' existe) my @palavras = @{$thesaurus->{$t}{$classe}}; # ver se ja' existe for (@palavras) { return $thesaurus if (lc eq lc($termo)); } } # nao existe: aumentar push @{$thesaurus->{$t}{$classe}}, $obj->{defined}{lc($termo)}; } else { # nao existe: aumentar $thesaurus->{$palavra}{_NAME_} = $palavra unless defined($thesaurus->{$palavra}) && defined($thesaurus->{$palavra}{_NAME_}); $obj->{defined}{lc($palavra)} = $palavra; push @{$thesaurus->{$palavra}{$classe}}, $obj->{defined}{lc($termo)}; } return $thesaurus; } ### # # sub addTerm { my $obj = shift; my $term = _term_normalize(shift); $obj->{$obj->{baselang}}{$term}{_NAME_} = $term; $obj->{defined}{lc($term)} = $term; } sub hasRelation { my ($obj, $term, $rel, $rterm) = @_; $rel = uc($rel); return 0 unless $obj->isDefined($term); # Check if term exists $term = $obj->_definition($term); my $has = 0; if ($rterm) { if (exists($obj->{externals}{$rel})) { $has = 1 if (grep { $_ eq $rterm } @{$obj->{$obj->{baselang}}{$term}{$rel}}); } else { $rterm = _term_normalize($rterm); $has = 1 if (grep { $_ eq $rterm} @{$obj->{$obj->{baselang}}{$term}{$rel}}); } } else { $has = 1 if exists($obj->{$obj->{baselang}}{$term}{$rel}); } return $has; } ### # # sub addRelation { my ($obj, $term, $rel, @terms) = @_; $rel = uc($rel); $obj->{descriptions}{$rel} = "..." unless defined($obj->{descriptions}{$rel}); unless ($obj->isDefined($term)) { $obj->{defined}{lc(_term_normalize($term))} = _term_normalize($term); } $term = $obj->_definition($term); if (exists($obj->{externals}{$rel})) { push @{$obj->{$obj->{baselang}}{$term}{$rel}}, @terms; } else { push @{$obj->{$obj->{baselang}}{$term}{$rel}}, map {_term_normalize($_)} @terms; for (@terms) { $obj->addTerm($_) unless $obj->isDefined($_); } } } ### # # sub deleteRelation { my ($self, $term, $rel, @terms) = @_; $rel = uc($rel); if (@terms) { for my $oterm (@terms) { $self->_deleteRelation($term, $rel, $oterm); ## Se existe inversa, do the same shit if (exists $self->{inverses}{$rel}) { $self->_deleteRelation($oterm, $self->{inverses}{$rel}, $term); } } } else { if (exists($self->{externals}{$rel})) { $self->_deleteRelation($term, $rel); } else { @terms = $self->terms($term,$rel); return unless @terms; $self->deleteRelation($term, $rel, @terms); } } } ### # # sub _deleteRelation { my ($obj, $term, $rel, $oterm) = @_; # return if the term is not defined return unless $obj->isDefined($term); $term = $obj->_definition($term); if ($oterm) { # if we have a full relation (term,rel,term), then it is not an external relation return if exists($obj->{externals}{$rel}); $oterm = _term_normalize($oterm); $obj->{$obj->{baselang}}{$term}{$rel} = [ grep { $_ ne $oterm } @{$obj->{$obj->{baselang}}{$term}{$rel}}]; } else { delete($obj->{$obj->{baselang}}{$term}{$rel}); } } ### # # sub deleteTerm { my $obj = shift; my $term = _term_normalize(shift); my $t2=$term; $term = $obj->_definition($term); my ($t,$c); warn("'$t2' => '$term'\n") && return unless defined($term); if (defined($obj->{$obj->{baselang}}{$term})){ delete($obj->{$obj->{baselang}}{$term}); delete($obj->{defined}{lc($term)}); } else {warn ("'$term' not found...\n");} foreach $t (keys %{$obj->{$obj->{baselang}}}) { foreach $c (keys %{$obj->{$obj->{baselang}}{$t}}) { my @a = (); if ( ref($obj->{$obj->{baselang}}{$t}{$c}) eq "ARRAY") { foreach (@{$obj->{$obj->{baselang}}{$t}{$c}}) { push(@a,$_) unless($_ eq $term); } $obj->{$obj->{baselang}}{$t}{$c}=\@a; } } } } ### # # sub downtr { my $self = shift; my $handler = shift; die("bad use of downtr method; args should be: hashRef, termlist") unless(ref($handler) eq "HASH"); my @tl = @_ ; #lc(shift); @tl = (sort {lc($a) cmp lc($b)} keys %{$self->{$self->{baselang}}}) unless (@tl); my $r2 = ""; #final result my $c; for my $t (@tl){ my $r = ""; $term = $t; if (defined( $handler->{"_NAME_"})){ $r .= &{$handler->{"_NAME_"}}; } my @rels = (keys %{$self->{$self->{baselang}}->{$t}}); my %rels = (); @rels{@rels} = @rels; my $order = defined $handler->{-order} ? $handler->{-order} : ( defined $self->{order} ? $self->{order} : []); delete(@rels{@$order}); @rels = ( @{$order}, (sort keys(%rels) )); for $c (@rels) { next unless $self->{$self->{baselang}}{$t}{$c}; next if ($c eq "_NAME_"); # Set environment variables to downtr function # # rel... # $rel = $c; # # List of terms... # if ($self->{languages}->{$rel}) { @terms = ( $self->{$self->{baselang}}{$t}{$rel} ); } else { @terms = @{$self->{$self->{baselang}}{$t}{$rel}}; } # # Current term... # $term = $t; if (exists($handler->{$rel})) { $r .= &{$handler->{$rel}} // ""; } elsif (exists($handler->{-default})) { $r .= &{$handler->{-default}} // ""; } else { $r .= "\n$rel\t".join(", ",@terms); } } for($r){ if (exists($handler->{'-eachTerm'})) { my $ans = &{$handler->{'-eachTerm'}}; $r2 .= ($ans)?$ans:""; } else { $r2 .= $_; } } } if (defined($handler->{-end})) { for($r2){ $_ = &{$handler->{'-end'}} } } $r2; } ### # # sub tc{ my ($self,$term,@relations) = @_; my %x = _tc_aux($self, $term, {}, @relations); return (keys %x); } ### # # sub toHash { my ($self, $rel) = @_; $rel //= "NT"; $rel = [$rel] unless ref($rel); my $top = $self->topName; return +{ $top => $self->_toHash($top, $rel, [$top]) }; } sub _toHash { my ($self, $term, $rel, $stack) = @_; my $h = $self->depth_first($term, 1, @$rel); if (keys %$h) { for (keys %$h) { $h->{$_} = $self->_toHash($_, $rel, [@$stack, $_]); } } else { $h = join("::", @$stack); } return $h; } ## # # sub toJson { my ($self, $rel) = @_; $rel //= "NT"; $rel = [$rel] unless ref($rel); my $top = $self->topName; $self->_toJson($top, $rel); } sub _toJson { my ($self, $term, $rel) = @_; my $h = $self->depth_first($term, 1, @$rel); my $json = "{ \"data\": \"$term\", \"attr\":{id:\"$term\"}"; if (keys %$h) { $json .= ", \"children\": ["; $json .= join(", ", map { $self->_toJson($_, $rel) } keys %$h); $json .= "]" } $json .= "}"; } ### # # sub _tc_aux { my ($self,$term,$vis,@relat) = @_; $term = $self->getdefinition($term); my %r = ( $term => 1 ); for ($self->terms($term,@relat)) { next if exists $vis->{$_}; $vis->{$_}++; %r = (%r, $_ => 1, _tc_aux($self,$_,@relat)) unless $r{$_}; } return %r; } ### # # sub _term_normalize { my $t = shift; $t =~ s/^\s*(.*?)\s*$/$1/; $t =~ s/\s\s+/ /g; return $t; } sub capitalize { my $op = shift; my $text = shift; if ($op) { $text = join(" ",map {ucfirst} split /\s+/, $text); } return $text; } # remove duplicados de uma lista sub _set_of { my %set = (); $set{$_} = 1 for @_; return keys %set; } 1; __END__ =encoding UTF-8 =head1 NAME Biblio::Thesaurus - Perl extension for managing ISO thesaurus =head1 SYNOPSIS use Biblio::Thesaurus; $obj = thesaurusNew(); $obj = thesaurusLoad('iso-file'); $obj = thesaurusRetrieve('storable-file'); $obj = thesaurusMultiLoad('iso-file1','iso-file2',...); $obj->save('iso-file'); $obj->storeOn('storable-file'); $obj->addTerm('term'); $obj->addRelation('term','relation','term1',...,'termn'); $obj->deleteTerm('term'); $obj->isDefined('term'); $obj->describe( { rel='NT', desc="Narrow Term", lang=>"UK" } ); $obj->addInverse('Relation1','Relation2'); $obj->order('rela1', 'rel2', ....); @order = $obj->order(); $obj->languages('l1', 'l2', ....); @langs = $obj->languages(); $obj->baselang('l'); $lang = $obj->baselang(); $obj->topName('term'); $term = $obj->topName(); $html = $obj->navigate(+{configuration},%parameters); $html = $obj->getHTMLTop(); $output = $obj->downtr(\%handler); $output = $obj->downtr(\%handler,'term', ... ); $obj->appendThesaurus("iso-file"); $obj->appendThesaurus($tobj); $obj->tc('term', 'relation1', 'relation2'); $obj->depth_first('term', 2, "NT", "UF") $latex = $obj->toTex( ...) $xml = $obj->toXml( ...) =head1 ABSTRACT This module provides transparent methods to maintain Thesaurus files. The module uses a subset from ISO 2788 which defines some standard features to be found on thesaurus files. The module also supports multilingual thesaurus and some extensions to the ISOs standard. =head1 DESCRIPTION A Thesaurus is a classification structure. We can see it as a graph where nodes are terms and the vertices are relations between terms. This module provides transparent methods to maintain Thesaurus files. The module uses a subset from ISO 2788 which defines some standard features to be found on thesaurus files. This ISO includes a set of relations that can be seen as standard but, this program can use user defined ones. So, it can be used on ISO or not ISO thesaurus files. =head1 File Structure Thesaurus used with this module are standard ASCII documents. This file can contain processing instructions, comments or term definitions. The instructions area is used to define new relations and mathematical properties between them. We can see the file with this structure: ______________ | | | HEADER | --> Can contain, only, processing instructions, |______________| comment or empty lines. | | | Def Term 1 | --> Each term definition should be separated | | from each other with an empty line. | Def Term 2 | | | | ..... | | | | Def Term n | |______________| Comments can appear on any line. Meanwhile, the comment character (B<#>) should be the first character on the line (with no spaces before). Comments line span to the end of the line (until the first carriage return). Processing instructions lines, like comments, should start with the percent sign (B<%>). We describe these instructions later on this document. Terms definitions can't contain any empty line because they are used to separate definitions from each other. On the first line of term definition record should appear the defined term. Next lines defines relations with other terms. The first characters should be an abbreviation of the relation (on upper case) and spaces. Then, should appear a comma separated list of terms. There can be more than one line with the same relation. Thesaurus module will concatenate the lists. If you want to continue a list on the next line you can repeat the relation term of leave some spaces between the start of the line and the terms list. Here is an example: Animal NT cat, dog, cow fish, ant NT camel BT Life being cat BT Animal SN domestic animal to be kicked when anything bad occurs. There can be defined a special term (C<_top_>). It should be used when you want a top tree for thesaurus navigation. So, we normally define the C<_top_> term with the more interesting terms to be navigated. The B subset used are: =over 4 =item B - Top Term The broadest term we can define about the current term. =item B - Narrower Term Terms more specific than current term. =item B - Broader Term More generic terms than current term. =item B - Synonym Another chances when finding a Synonym. =item B - Quasi-Synonym Terms that are no synonyms of current term but can be used, sometimes with that meaning. =item B - Related Term Related term that can't be inserted on any other category. =item B - Scope Note Text. Note of context of the current term. Use for definitions or comments about the scope you are using that term. =back =head2 Processing Instructions Processing instructions, as said before, are written on a line starting with the percent sign. Current commands are: =over 4 =item B When presenting a thesaurus, we need a term, to know where to start. Normally, we want the thesaurus to have some kind of top level, where to start navigating. This command specifies that term, the term that should be used when no term is specified. Example: %top Contents Contents NT Biography ... RT ... =item Boding This command defines the encoding used in the thesaurus file. Example: %enc utf8 =item Berse This command defines the mathematic inverse of the relation. That is, if you define C and you know that C is related by C with C, then, C is related by C with C. Example: %inv BT NT %inverse UF USE =item Bription This command defines a description for some relation class. These descriptions are used when outputting thesaurus on HTML. Example: %desc SN Note of Scope %description IOF Instance of If you are constructing a multilingual thesaurus, you will want to translate the relation class description. To do this, you should use the C command with the language in from of it: %desc[PT] SN Nota de Contexto %description[PT] IOF Instancia de =item Bernals This defines classes that does not relate terms but, instead, relate a term with some text (a scope note, an url, etc.). This can be used like this: %ext SN URL %externals SN URL Note that you can specify more than one relation type per line. =item Buages This other command permits the construction of a multilingual thesaurus. TO specify languages classifiers (like PT, EN, FR, and so on) you can use one of these lines: %lang PT EN FR %languages PT EN FR To describe (legend) the language names, you should use the B command, so, you could append: %description PT Portuguese %description EN English %description FR French =item Buage This one makes it possible to explicitly name the base language for the thesaurus. This command should be used with the C one, to describe the language name. Here is a simple example: %baselang PT %languages EN FR %description PT Portuguese %description EN English %description FR French =back =head2 I18N Internationalization functions, C and C should be used before any other function or constructor. Note that when loading a saved thesaurus, descriptions defined on that file will be not translated. That's important! interfaceLanguages() This function returns a list of languages that can be used on the current Thesaurus version. interfaceSetLanguage( ) This function turns on the language specified. So, it is the first function you should call when using this module. By default, it uses Portuguese. Future version can change this, so you should call it any way. =head1 API This module uses a perl object oriented model programming, so you must create an object with one of the C, C or C commands. Next commands should be called using the B fashion. =head1 Constructors =head2 thesaurusNew To create an empty thesaurus object. The returned newly created object contains the inversion properties from the ISO classes and some stub descriptions for the same classes. =head2 thesaurusLoad To use the C function, you must supply a file name. This file name should correspond to the ISO ASCII file as defined on earlier sections. It returns the object with the contents of the file. If the file does not defined relations and descriptions about the ISO classes, they are added. Also, $obj = thesaurusLoad({ completed => 1}, 'iso-file'); can be used to say that the thesaurus needs B to be complete after load. =head2 thesaurusMultiLoad You can join different thesaurus ISO files using this function: $obj = thesaurusMultiLoad('iso-file1','iso-file2',...); =head2 appendThesaurus You can also append a thesaurus ISO (or another thesaurus object) to a loaded thesaurus. For that, use one of: $obj->appendThesaurus("iso-file"); $obj->appendThesaurus( $other_thesaurus_object ); =head2 thesaurusLoadM This method is used to load a thesaurus on the meta-thesaurus format. This is still under development. =head2 thesaurusRetrieve Everybody knows that text access and parsing of files is not efficient. So, this module can save and load thesaurus from Storable files. This function should receive a file name from a file which was saved using the C function. =head1 Methods =head2 save This method dumps the object on an ISO ASCII file. Note that the sequence C, C is not the identity function. Comments are removed and processing instructions can be added. To use it, you should supply a file name. Note: if the process fails, this method will return 0. Any other method die when failing to save on a file. =head2 meta2str This method returns the ISO ascii description of the metadata. =head2 storeOn This method saves the thesaurus object in Storable format. You should use it when you want to load with the C function. =head2 addTerm You can add terms definitions using the perl API. This method adds a term on the thesaurus. Note that if that term already exists, all its relations will be deleted. =head2 all_terms Returns an array with all terms for the thesaurus base language. NOTE: this function is deprecated. Use allTerms instead. =head2 allTerms Returns an array with all terms for the thesaurus base language. =head2 topName Returns the term in the top of the thesaurus, or defined a new one if called with an argument. =head2 top_name Deprecated. See C<>; =head2 addRelation To add relations to a term, use this method. It can be called again and again. Previous inserted relations will not be deleted. This method can be used with a list of terms for the relation like: $obj->addRelation('Animal','NT','cat','dog','cow','camel'); Note: After you add a big amount of relations, autocomplete the thesaurus using the $obj->complete() method. Completing after each relation addiction is time and cpu consuming. =head2 hasRelation Checks if a specific relation exists in the Thesaurus: if ($obj->hasRelation('Animal','NT','cat')) { ... } You can check if a term has a relation "X" with anything: if ($obj->hasRelation('Animal','SN')) { ... } =head2 deleteRelation $obj->deleteRelation('Animal','NT','cat','dog','cow','camel'); =head2 deleteTerm Use this method to remove all references of the term supplied. Note that B references will be deleted. =head2 describe You can use this method to describe some relation class. You can use it to change the description of an existing class (like the ISO ones) or to define a new class. =head2 isDefined Use this method to check if a term exists in the thesaurus. =head2 setExternal Use this method to define that a relation is "extern". =head2 isExternal Use this method to check if a relation is "extern". =head2 isLanguage Use this method to check if a relation is a Language. =head2 getdefinition Deprecated. Use C< =head2 getDefinition Returns the definition for a term. The definition is a feature structure containing the term information. =head2 getDescription Given a relation name and a language (or the default will be used), it returns the description for that relation. =head2 relations Call this method with a term, and it returns a list of the relations defined for that term. =head2 addInverse This method should be used to describe the inversion property to relation classes. Note that if there is some previous property about any of the relations, it will de deleted. If any of the relations does not exist, it will be added. =head2 order With this method you can define (and access) the order of classes. This order is used whenever you call a dump function, or the navigation CGI. =head2 navigate This function is a some what type of CGI included on a object method. You must supply an associative array of CGI parameters. This method prints an HTML thesaurus for Web Navigation. The typical thesaurus navigation CGI is: #!/usr/bin/perl -w use CGI qw/:standard/; use Biblio::Thesaurus; print header; for (param()) { $arg{$_} = param($_) } $thesaurus = thesaurusLoad("thesaurus_file"); print $thesaurus->navigate(%arg); This method can receive, as first argument, a reference to an associative array with some configuration variables like what relations to be expanded and what language to be used by default. So, in the last example we could write $thesaurus->navigate(+{expand=>['NT', 'USE'], lang =>'EN'}) meaning that the structure should show two levels of 'NT' and 'USE' relations, and that it should use the English language. These options include: =over 4 =item capitalize try to capitalize terms when they are the title of the page. =item expand a reference to a list of relations that should be expanded at first level; Defaults to the empty list. =item title can be C or C. If it is C, the current term will not be shown as a title; Defaults to C. =item scriptname the name of the script the links should point on. Defaults to current page name. =item level1hide a reference to a list of relations to do not show on the first level. Defaults to the empty list. Useful to hide the 'LEN' relation when using Library::Simple. =item level2size the number of terms to be shown on each second level relation; Defaults to 0 (all terms). =item level2hide a reference to a list of relations to do not show on the second level. Defaults to the empty list. =item topic_name the name of the topic CGI parameter (default: "t") =back =head2 dumpHTML This method returns a big string containing all the thesaurus in HTML. It is mainly used for debug. =head2 getHTMLTop This method returns the HTML needed for the top level of the browsing thesaurus. It can be useful when putting a top level on the first page of a portal. =head2 complete This function completes the thesaurus based on the invertibility properties. This operation is only needed when adding terms and relations by this API. Whenever the system loads a thesaurus ISO file, it is completed. =head2 baselang Use this method to set or retrieve the base language of the thesaurus. If no baselang is provided, the value "_" is returned =head2 downtr The C method is used to produce something from a set of terms. When no term is given, the all thesaurus is taken. It should be passed as argument a term and an associative array (handler) with anonymous subroutines that process each relation. These functions can use the pre-instantiated variables C<$term>, C<$rel>, C<@terms>. The handler can have three special functions: C<-default> (default handler for relations that don't have a defined function in the handler), C<-eachTerm> executed with each term output (received as C<$_>), and C<-end> executed over the output of the the other functions (received as C<$_>), If a C<-order> array reference is provided, the correspondent order of the relations will be used. Example: $the->downtr( { NT => sub{ ""}, #Do nothing with NT relations -default => sub{ print "$rel", join(",",@terms) } }, "frog" ); print $thesaurus->downtr( {-default => sub { "\n$rel \t".join("\n\t",@terms)}, -eachTerm => sub { "\n______________ $term $_"}, -end => sub { "Thesaurus :\n $_ \nFIM\n"}, -order => ["BT","NT","RT"], }); Both functions return a output value: the concatenation of the internal values (but functions can also work with side effects) =head2 depth_first The C method is used to get the list of terms (in fact the tree of terms) related with C<$term> by relations C<@r> up to the level C<$lev> $hashref = $the->depth_first($term ,$lev, @r) $hashref = $the->depth_first("frog", 2, "NT","UF") C<$lev> should be an integer greater then 0. =head2 tc transitive closure The C method is used to eval the transitive closure of the relations C<@r> starting from a term C<$term> $the->tc($term , @r) $the->tc("frog", "NT","UF") =head2 terms The C method is used to get all the terms related by relations C<@r> with C<$term> $the->terms($term , @r); $the->terms("frog", "NT", "UF"); =head2 toTex Writes a thesaurus in LaTeX format... The first argument is used to pass a tag substitution hash. It uses downtr function to make the translation; a downtr handler can be given to tune some transformations details... print $thesaurus->toTex( {EN=>["\\\\\\emph{Ingles} -- ",""]}, {FR => sub{""}}) =head2 toXml This method writes a thesaurus in XML format... The first argument is used fo pass a tag substitution hash. It uses downtr function to make the translation; a downtr handler can be given to tune some transformations details... print $thesaurus->toXml(); =head2 toJson Returns a JSON tree based on NT relation. Other relation can be supplied as an argument. Future versions might include language selection. print $thesaurus->toJson(); =head2 toHash Returns a Hash reference with a tree based on NT relation. Other relation can be supplied as an argument. Future versions might include language selection. print $thesaurus->toHash(); =head1 AUTHOR Alberto Simoes, José Joao Almeida, Sara Correia, This module is included in the Natura project. You can visit it at http://natura.di.uminho.pt, and access the SVN tree. =head1 COPYRIGHT & LICENSE Copyright 2000-2012 Project Natura. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =head1 SEE ALSO The example thesaurus file (C), Manpages: Biblio::WebPortal(3) Biblio::Catalog(3) Biblio::Catalog::Bibtex(3) perl(1) manpages. =cut __DATA__ =head2 loading from Iso 2788 =head2 building a thesaurus with internal constructors =head2 writing a thesaurus in another format Biblio-Thesaurus-0.43/examples/animal.the000644 000765 000024 00000000577 11442435506 020504 0ustar00ambsstaff000000 000000 %enc ISO_8859_1 %baselang PT %languages EN %languages FR %desc[EN] NT Narrower Term %desc[EN] BT Broader Term %desc NT Termo Específico %desc BT Termo Genérico %desc HAS Instâncias %desc IOF instância de %inverse IOF HAS %top animal animal NT mamífero, doméstico, réptil EN animal mamífero HAS gato, leão leão EN lion gato EN cat FR chat BT doméstico sapo IOF réptil EN frog Biblio-Thesaurus-0.43/examples/animals1.iso000644 000765 000024 00000001216 11531755571 020757 0ustar00ambsstaff000000 000000 %externals SN URL %language EN %baselanguage EN %inverse NT BT %inverse RT RT %inverse USE UF %inverse UF USE %inverse BT NT %description NT Narrower term %description SN Scope note %description RT Related term %description USE Synonym %description TT Top term %description UF Quasi synonym %description IS-A ... %description BT Broader term Animal BT Carnivora BT Felidae BT Panthera Carnivora BT Felidae BT Canidae Panthera NT Felidae BT lion BT tiger lion NT Panthera Felidae NT Carnivora BT Panthera BT Felis tiger NT Panthera house_cat NT Felis Lucky IS-A house_cat Felis NT Felidae BT house_cat Canidae NT Carnivora Biblio-Thesaurus-0.43/examples/animals2.iso000644 000765 000024 00000000727 11531755640 020763 0ustar00ambsstaff000000 000000 %externals SN URL %language EN %baselanguage EN %inverse NT BT %inverse RT RT %inverse USE UF %inverse UF USE %inverse BT NT %description NT Narrower term %description SN Scope note %description RT Related term %description USE Synonym %description TT Top term %description UF Quasi synonym %description IS-A ... %description BT Broader term lion Carnivora BT Felis house_cat NT Felis Lucky IS-A lion Felis NT Carnivora BT house_cat the_lion_king IS-A lion Biblio-Thesaurus-0.43/examples/biling2thesaurus000755 000765 000024 00000003313 10730327653 021752 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s our ($bl,$l1,$l2,$sep,$rel,$norel); $sep ||= ":"; $rel ||= "BT" unless $norel; die("usage: $0 [-l1=EN] [-l2=PT] [-rel=IOF] [-sep=:] [-norel] file\n") unless $l1 || $l2; # $l1 = 1 unless $l2; print "%baselang $bl\n%lang $bl $l1 $l2\n\n" if $bl; while(<>){ if(/^[%]enc(?:oding)?\s+(\S+)/){ print $_; } elsif(/(.+)$sep(.+)$sep(.+)/){ if($l1){ print "$2\n$l1 $1\n", ($norel ? "": "$rel $3\n" ), "\n" ; } else { print "$1\n$l2 $2\n", ($norel ? "": "$rel $3\n" ), "\n" ; } } elsif(/(.+)$sep(.+)/){ if($l1){ print "$2\n$l1 $1\n\n" ; } else { print "$1\n$l2 $2\n\n" ; } } elsif(/^\s*$/){ print "\n";} } __END__ =head1 NAME biling2thesaurus - converts a ":"-separated bilingual dictionary to ISO thesaurus =head1 SYNOPSIS biling2thesaurus [-l1=EN|-l2=FR] bilingDict > thesaurus =head1 DESCRIPTION Converts a Bilingue dictionary into a ISO-thesaurus. Input format: lines containing term1 : term2 : class or term1: term2 =head2 Input Format :: :: Input files may have a encoding declaration. Example: %encoding UTF-8 =head2 Output Format LANG2 BT LANG2 BT =head1 Options -l1=FR baselang is lang2; lang1 is french -l2=FR baselang is lang1; lang2 is french -rel=IOF Relation tag is IOF (default BT) -norel Ignore classes in the inputfile (default: show relations) -sep="::" field separator is "::" (default ":") -bl=PT baselang is PT (makes a %baselang PT in output file) =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/examples/ex.tax000644 000765 000024 00000000533 11524567333 017670 0ustar00ambsstaff000000 000000 %enc utf8 Geografia Política América América Central Belize Costa Rica El Salvador Guatemala Honduras Nicarágua Panamá América Do Norte Canadá Alberta México Biblio-Thesaurus-0.43/examples/ex1.pl000644 000765 000024 00000000743 10730327653 017570 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w use lib qw{ .. ../.. ../../.. }; use Biblio::Thesaurus; use Data::Dumper; print $INC{ "Thesaurus.pm"} ; $thesaurus = thesaurusLoad('animal.the'); print Dumper($thesaurus->depth_first("animal",1,"NT","USE","HAS")); print Dumper($thesaurus->depth_first("_top_",1,"NT","USE","HAS")); print Dumper($thesaurus->depth_first("_top_",2,"NT","USE","HAS")); print Dumper($thesaurus->depth_first("_top_",3,"NT","USE","HAS")); #print Dumper($thesaurus->jjdt({},"_top_")); Biblio-Thesaurus-0.43/examples/ex2.pl000644 000765 000024 00000002361 10730327653 017567 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s #use lib qw{ .. ../.. ../../.. }; use Biblio::Thesaurus; use Data::Dumper; our ($t); unless($t){ die ("usage: ex2.pl t=1..7\n");} $thesaurus = thesaurusLoad('animal.the'); if($t==1){ print Dumper($thesaurus->depth_first("animal",1,"NT","USE","HAS")); print Dumper($thesaurus->depth_first("_top_",1,"NT","USE","HAS")); print Dumper($thesaurus->depth_first("_top_",2,"NT","USE","HAS")); print Dumper($thesaurus->depth_first("_top_",3,"NT","USE","HAS"));} elsif($t ==2){ print $thesaurus->downtr( {-default => sub { "\n$rel \t".join("\n\t",@terms)}, -eachTerm => sub { "\n______________($term)______$_"}, -end => sub { "Thesaurus :\n $_ \nFIM\n"} }); } elsif($t ==3){ print $thesaurus->downtr( {-default => sub { "\n$rel \t".join("\n\t",@terms)}, -eachTerm => sub { "\nPT\tterm$_\n"}, -end => sub { "Thesaurus :\n $_ \nFIM\n"}, -order => ["EN","FR","BT"], }, "gato", "animal","sapo"); } elsif($t ==4){ print $thesaurus->downtr( {-eachTerm => sub { "\n\n$term$_"} }); } elsif($t ==5){ print $thesaurus->toTex(); } elsif($t ==6){ print $thesaurus->toTex({EN=>["\\\\\\emph{Inglês} -- ",""]},{FR => sub{""}}); } elsif($t ==7){ print $thesaurus->toXml(); } Biblio-Thesaurus-0.43/examples/ex3.pl000644 000765 000024 00000001162 10730327653 017566 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s use Biblio::Thesaurus; use CGI qw(:all); my $the = shift || "animal.the"; $thesaurus = thesaurusLoad($the); print $thesaurus->downtr( {-default => sub { dt($thesaurus->describe($rel))."\n". join("\n", (map {dd(a({href=>"#$_"},$_))} sort @terms))}, -eachTerm => sub { dt(a({name=>"$term"},$term))."\n".dd(dl($_))."\n"}, -end => sub { h1("Thesaurus - all in one page").dl($_)."n"}, -order => ["EN","FR","BT"], URL => sub { dt($thesaurus->describe($rel))."\n". join("\n", (map {dd(a({href=>"$_"},$_))} @terms))}, }); = converte to HTML Biblio-Thesaurus-0.43/examples/ex4.pl000644 000765 000024 00000002777 10730327653 017604 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s use Biblio::Thesaurus; use CGI qw(:all); my $the = shift || "secondorder.the"; my $t = thesaurusLoad($the); my @ts=(); my @r=qw(_baselang_ _external_ _top_ _language_ _relation_ _order_); my %r; @r{@r}=@r; if(@ts=$t->terms("_order_","NT")) { $t->order(@ts); @r{@ts}=@ts } if(@ts=$t->terms("_external_","NT")){ $t->setExternal(@ts); @r{@ts}=@ts } if(@ts=$t->terms("_top_","NT")) { $t->top_name($ts[0]); } if(@ts=$t->terms("_baselang_","NT")){ $t->baselang($ts[0]); @r{@ts}=@ts } if(@ts=$t->terms("_language_","NT")){ $t->languages(@ts); @r{@ts}=@ts } # for each new relation describe it, add Invers and remove it as Term if(@ts=$t->terms("_relation_","NT")){ $t->downtr( { SN => sub{ $t->describe($term,$terms[0]) }, INV => sub{ $t->addInverse($term,$terms[0])}, -order => ["SN","INV"], -eachTerm => sub{ $r{$term}=$term }, }, @ts); } for (keys %r){$t->deleteTerm($_)} ##Show this as HTML print $t->downtr( {-default => sub { dt($t->describe($rel))."\n". join("\n", (map {dd(a({href=>"#$_"},$_))} sort @terms))}, -eachTerm => sub { dt(a({name=>"$term"},$term))."\n".dd(dl($_))."\n"}, -end => sub { h1("Thesaurus - all in one page").dl($_)."\n"}, -order => (defined $t->{order} ? [$t->order()] : ["EN","FR","BT"]), URL => sub { dt($t->describe($rel))."\n". join("\n", (map {dd(a({href=>"$_"},$_))} @terms))}, }); =head1 NAME ex4.pl - exemplo usando metadata =SYNOPIS Biblio-Thesaurus-0.43/examples/ex5.pl000644 000765 000024 00000001374 10730327653 017575 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s use Biblio::Thesaurus; use CGI qw(:all); use Data::Dumper; my $the = shift || "secondorder.the"; my $t = thesaurusLoadM($the); print Dumper ($t); print $t->baselang(), join("=",$t->order()); print $t->downtr( {-default => sub { dt($t->describe($rel))."\n". join("\n", (map {dd(a({href=>"#$_"},$_))} sort @terms))}, -eachTerm => sub { dt(a({name=>"$term"},$term))."\n".dd(dl($_))."\n"}, -end => sub { h1("Thesaurus - all in one page").dl($_)."\n"}, -order => (defined $t->{order} ? [$t->order()] : ["EN","FR","BT"]), URL => sub { dt($t->describe($rel))."\n". join("\n", (map {dd(a({href=>"$_"},$_))} @terms))}, }); =head1 NAME ex4.pl - exemplo usando metadata =SYNOPIS Biblio-Thesaurus-0.43/examples/example.pl000644 000765 000024 00000000244 10730327653 020522 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w use Biblio::Thesaurus; use Data::Dumper; $thesaurus = thesaurusLoad('thesaurus'); print Dumper($thesaurus->depth_first("_top_",3,"NT","BT")); Biblio-Thesaurus-0.43/examples/secondorder.the000644 000765 000024 00000010031 10730327653 021536 0ustar00ambsstaff000000 000000 #Meta definitions _baselang_ NT PT _order_ NT SN, NT, INST, HAS, BY, MAKES, URL, POF, IOF, PT, EN, FR, BT _language_ NT EN, FR _top_ NT Alfarrábio _relation_ NT INST, HAS, BY, MAKES, URL, POF, IOF _external_ NT URL INST SN Intance of INV IOF HAS INV POF POF SN Part of BY INV MAKES SN Made by MAKES SN he makes #Fim de Meta Alfarrábio NT elemento urbano NT local NT pessoa NT imagem NT arquivo NT música NT associações e instituições NT texto desporto e lazer NT orientação NT trilhos NT turismo turismo NT quintas e solares arquivo NT enciclopédia, NT paginaCooperante, revista, agenda cultural HAS fundo INST Alfarrábio, Vercial, Geira, Arquivo de letras de música portuguesa, Cancioneiro, História de Portugal agenda cultural INST Agenda Braga Cultural escritor NT poeta matemático músico NT compositor de música clássica, agrupamento musical, compositor, cantor SN pessoa ou conjunto de pessoas directamente ligados à música RT instrumentos musicais MAKES música música NT karaoke, partitura NT instrumentos musicais, fado, mornas, romances NT música com acordes NT etno-musicologia UF canção pessoa SN pg. ou conjunto que contem informação sobre uma determinada pessoa NT artista, matemático, personagem NT jornalista, médico, industrial, político, advogado artista NT escritor, músico, cineasta, escultor, pintor personagem SN personagem histórica NT monarca, navegador rei USE monarca pintor SN inclui artistas autores de pinturas, desenhos, gravuras, aguarelas MAKES pintura, desenho, aguarela poema BY poeta texto BY escritor NT lenda, oração, poema, conto popular, biografia, monografia, livro, artigo estátua BY escultor revista UF jornal foto UF fotografia BT imagem imagem NT desenho, aguarela, foto, pintura línguas INST português, galego, chinês NT gramática português UF língua portuguesa software SN programas, script, módulos, ... NT script, module, CGI associação SN incluindo núcleos, clubes, etc UF grupo, núcleo associações e instituições NT associação, museu, biblioteca, parque, parque UF parque natural, area-protegida INST Parque Nacional da Peneda-Gerês ,Parque Natural do Alvão INST Area Protegida do Litoral de Esposende período barroco NT sec. XVIII ciência NT matemática, físico-química, botânica, zoologia, ecologia, astronomia, geologia, telecomunicões astronomia ecologia NT parque literatura BY escritor curso BT didática físico-química matemática BY matemático monografia SN texto ou conjunto de textos acerca de uma localidade casa NT café NT igreja NT convento café HAS esplanada monumento NT igreja , paço, convento, ruína igreja HAS igreja_po, rosácea janela NT janela de rótula elemento urbano NT jardim, praça, rua, monumento, pormenor arq, casa elemento natural/rural NT flor, árvore, paisagem, moinhos pormenor arq NT janela, estátua, porta, varanda, clara-bóia local NT povoação, região, cidade, vila, freguesia, concelho, distrito, província NT país cidade INST Viana do Castelo, Bragança, Porto, Braga, Vila Real, Miranda do Douro INST Guimarães, Esposende, Póvoa do Varzim, Chaves, Famalicão INST Fafe, Mirandela, Valpaços vila INST Taipas, Ponde do Lima, Ponte da Barca, Lindoso, Vila Flor INST Vila do Conde, Monção Minho IOF província HAS Viana do Castelo, Braga, Guimarães, Esposende, Fafe HAS Taipas, Ponde do Lima, Ponte da Barca, Lindoso HAS Caminha, Melgaço, Valença, Famalicão, Monção Trás-os-Montes IOF província HAS Bragança, Vila Real, Miranda do Douro, Chaves, Valpaços HAS Vila Flor, Mirandela, Mogadouro, Macedo de Cavaleiros, Moncorvo Douro Litoral IOF província HAS Porto Portugal IOF país HAS Trás-os-Montes, Douro Litoral, Minho Cabo Verde IOF país RT mornas etnografia NT seda, olaria, linho, rendas, alfaias agrícolas, etno-musicologia têxtil NT linho, seda, rendas linho URL http://natura.di.uminho.pt/~jj/linho/ história NT numismática, arqueologia HAS personagem arqueologia NT arqueosítio visita guiada SN Exemplo: passeios virtuais, etc clérigo UF padre BT pessoa Biblio-Thesaurus-0.43/examples/tabterm000755 000765 000024 00000010110 11221140164 020071 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s our ($l,$t); #use POSIX qw(locale_h); #setlocale(&POSIX::LC_ALL, "pt_PT"); #use encoding "utf8", STDIN => "latin1"; use Unicode::Collate; my $ptsort = Unicode::Collate::->new(); use Data::Dumper; use strict; my $meta={}; my $T = {}; my %desc=(); my %inv=qw( nt bt bt nt has pof pof has iof inst inst iof rt rt ); my $name=$ARGV[0]; for my $f(@ARGV){ procFile($f); } geraThe({output=>"_$name.the"}); geraTex({output=>"_$name.tex"}) if $t; geraList({output=>"_$name.list"}) if $l; sub procFile{ my $file=shift; my $f; open ($f,$file) or die ("cant open $file\n"); while(<$f>){ my @l=(); s/\s*$//; if(/^#/ or /^$/){ next} if (/^%the(?:saurus)?(.*)/) { my $a=$1; while($a !~ m/\)\s*$/){ $a .= <$f>; } geratrans($a); } ## elsif (/^%include\s*"(.*?)"/) { procFile($1); } elsif (/^%enc(?:oding)?\s*(\S+)/) { binmode($f,":encoding($1)") or warn("cant binmode ($!)\n"); } elsif (/^%desc\s+(\S+)\s+(.*)/) { $desc{$1} = $2; } elsif (/^%inv(?:erse)?\s+(\S+)\s+(\S+)/) { $inv{$1} = $2; } elsif (/^%(.*)/) { warn("????:Erro: $_\n") } else{ @l = map { [split(/\s*\|\s*/, $_ ) ] } split(/\s*:\s*/,$_); addthe($meta,@l); } } close $f; } sub addthe{ my ($m,@tup)=@_; my $t = shift(@{$tup[0]}); for(keys %{$m->{type}}){ for my $v ( @{$tup[$_]}){ add3($v,"bt", $m->{type}{$_}); } } for(0.. scalar(@{$m->{rel}})-1){ for my $v ( @{$tup[$_]}){ add3($t,$m->{rel}[$_],$v); for(keys %{$meta->{add}}){ add3($t,$_,$meta->{add}{$_}); } } } } sub add3{ my ($t1,$r,$t2)=@_; $T->{$t1}{$r }{$t2}=1; $T->{$t2}{$inv{$r}}{$t1}=1 if $inv{$r}; } sub geratrans{ my $arg=shift; $arg =~ s/^\s*\(?\s*//; $arg =~ s/\s*\)\s*$//; my $n=0; for(split(/\s*:\s*/,$arg)){ if(/(.*?)\s*<\s*(.*)/){ $meta->{type}{$n}=$2; push(@{$meta->{rel}},$1); } elsif(/(.*?)\s*=>\s*(.*)/){ $meta->{add}{$1}=$2; push(@{$meta->{rel}},$1); } else{ push(@{$meta->{rel}},$_); } $n++; } } sub geraList{ my %opt =(output => "_output.the"); if(ref($_[0]) eq "HASH") {%opt = (%opt, %{shift(@_)})}; open (F ,">$opt{output}"); binmode(F,":utf8"); for (sort( keys %$T)){ print F "$_\n"} close F } sub geraThe{ # use locale; my %opt =(output => "_output.the"); if(ref($_[0]) eq "HASH") {%opt = (%opt, %{shift(@_)})}; open (F ,">$opt{output}"); binmode(F,":utf8"); print F qq{ %enc utf8 }; for (keys %inv) { print F "%inv $_ $inv{$_}\n"; } for (keys %desc){ print F "%desc $_ $desc{$_}\n"; } for my $t ($ptsort->sort( keys %$T)){ print F "\n$t\n"; for my $r (keys %{$T->{$t}}){ for my $t2 ($ptsort->sort( keys %{$T->{$t}{$r}})){ print F "$r $t2\n"; } } } close F; } sub geraTex{ # use locale; my %opt =(output => "_output.tex"); if(ref($_[0]) eq "HASH") {%opt = (%opt, %{shift(@_)})}; open (F ,">$opt{output}"); binmode(F,":utf8"); if($opt{style} eq "agenda"){ print F q{ \documentclass[portuges,a4paper]{article} \usepackage{agbook} } } else { print F q{ \documentclass[portuges,a4paper,twocolumn]{book} } } print F q{ \usepackage{babel} \usepackage[utf8]{inputenc} %\usepackage[latin1]{inputenc} \usepackage{dict} \usepackage{t1enc} \usepackage{aeguill} \begin{document} }; print F "\\begin{dictionary}\n"; my $last=""; for my $t ($ptsort->sort( keys %$T)){ my $fl = unaccent(substr($t,0,1)); if($fl ne $last){print F "\\bigletterc{$fl}\n"; $last = $fl } print F "\n\\term{",ppttex($t),"}{"; for my $r (keys %{$T->{$t}}){ print F "\\\\\\textbf{",ppttex($r),"} "; for my $t2 ($ptsort->sort( keys %{$T->{$t}{$r}})){ print F ppttex($t2),", "; } } print F "}\n"; } print F "\\end{dictionary}\n\\end{document}"; close F; } sub ppttex{ my $a=shift; $a =~ s/([_\$\%\#\&])/\\$1/g; $a; } sub unaccent{ my $a=shift; use utf8; $a =~ y/áéíóúàèìòùâêîôûÃÉÃÓÚÂÊÎÔÛ/aeiouaeiouaeiouaeiouaeiou/; uc($a); } Biblio-Thesaurus-0.43/examples/thesaurus.english000644 000765 000024 00000010701 10730327653 022127 0ustar00ambsstaff000000 000000 %baselang EN %desc INST Instance of %desc HAS Has %desc BY Made by %desc MAKES It makes %ext URL SN _top_ NT urban element NT place NT person NT image NT archive NT music NT associations and institutions NT text sport and leisure NT orientation NT tracks NT tourism tourism NT fifth and solar ones archive NT encyclopedia, NT paginaCooperante, magazine, cultural agenda HAS deep INST Alfarrábio, Vercial, Geira, Archive of letters of Portuguese music, Cancioneiro, History of Portugal cultural agenda INST Cultural Braga Agenda writer NT poet mathematician musician NT composer of classic music, musical grouping, composer, singer SN person or set of directamente on people to music RT musical instruments MAKES music music NT karaoke, score NT musical instruments, destiny, mornas, romances NT music with chords NT etno-musicologia UF song person SN pg. or set that will count information on one definitive person NT artist, mathematician, personage NT journalist, doctor, industrial, politician, lawyer artist NT writer, musician, cineasta, sculptor, painter personage SN historical personage NT monarch, navigator king USE monarch painter SN he includes artists authors of paintings, drawings, engravings, aguarelas MAKES painting, drawing, aguarela poem BY poet text BY writer NT legend, conjunct, poem, popular story, biography, monograph, book, article statue BY sculptor magazine UF periodical photo UF photograph BT image image NT drawing, aguarela, photo, painting languages INST Portuguese, Gallego, Chinese NT grammar Portuguese UF Portuguese language software SN programs, script, modules, ... NT script, modulates, CGI association SN including nuclei, clubs, etc UF group, nucleus associations and institutions NT association, museum, library, park, park UF natural park, area-protected INST National park of the Peneda-Gerês, Natural Park of the Alvão INST Protecting Area of the Coast of Esposende baroque period NT sec. XVIII science NT mathematics, physicist-chemistry, botany, zoology, ecology, astronomy, geology, telecomunicões astronomy ecology NT park literature BY writer course BT didactics physicist-chemistry mathematics BY mathematician monograph SN text or set of texts concerning a locality house NT coffee NT church NT convent coffee HAS esplanade monument NT church, paço, convent, ruin church HAS igreja_po, rosette window NT lozenge window urban element NT garden, square, street, monument, detail arq, house natural/rural element NT flower, tree, landscape, mills detail arq NT window, statue, door, varanda, skylight place NT population, region, city, village, clientele, concelho, district, province NT country city INST Viana of the Castle, Bragança, Port, Braga, Real Village, Miranda of the Douro INST Guimarães, Esposende, Póvoa of the Varzim, Keys, Famalicão INST Fafe, Mirandela, Valpaços village INST Taipas, You put of the Rasp, Bridge of the Bark, Lindoso, Village Flower INST Village of the Conde, Monsoon Minho IOF province HAS Viana of the Castle, Braga, Guimarães, Esposende, Fafe HAS Taipas, You put of the Rasp, Bridge of the Bark, Lindoso HAS It walks, Melgaço, Valença, Famalicão, Monsoon Trás-os-Montes IOF province HAS Bragança, Real Village, Miranda of the Douro, Keys, Valpaços HAS Village Flower, Mirandela, Mogadouro, Macedo de Cavaleiros, Moncorvo Coastal Douro IOF province HAS Port Portugal IOF country HAS Trás-os-Montes, Coastal Douro, Minho Green Handle IOF country RT mornas etnografia NT agricultural silk, pottery, hemp, incomes, alfaias, etno-musicologia textile NT hemp, silk, incomes history NT numismatics, archaeology HAS personage archaeology NT arqueosítio guided visit SN Example: virtual strolls, etc clergyman UF priest BT person *********************************************************************************** WARNING: All e-mail sent to and from this address will be received or otherwise recorded by the A.G. Edwards corporate e-mail system and is subject to archival, monitoring or review by, and/or disclosure to, someone other than the recipient. ************************************************************************************ Biblio-Thesaurus-0.43/examples/thesaurus.portuguese000644 000765 000024 00000007556 10730327653 022716 0ustar00ambsstaff000000 000000 %baselang PT %desc INST Instance of %desc HAS Has %desc BY Made by %desc MAKES It makes %desc POF Part of %inv INST IOF %inv HAS POF %inv BY MAKES %ext URL SN %top Alfarrábio Alfarrábio NT elemento urbano NT local NT pessoa NT imagem NT arquivo NT música NT associações e instituições NT texto desporto e lazer NT orientação NT trilhos NT turismo turismo NT quintas e solares arquivo NT enciclopédia, NT paginaCooperante, revista, agenda cultural HAS fundo INST Alfarrábio, Vercial, Geira, Arquivo de letras de música portuguesa, Cancioneiro, História de Portugal agenda cultural INST Agenda Braga Cultural escritor NT poeta matemático músico NT compositor de música clássica, agrupamento musical, compositor, cantor SN pessoa ou conjunto de pessoas directamente ligados à música RT instrumentos musicais MAKES música música NT karaoke, partitura NT instrumentos musicais, fado, mornas, romances NT música com acordes NT etno-musicologia UF canção pessoa SN pg. ou conjunto que contem informação sobre uma determinada pessoa NT artista, matemático, personagem NT jornalista, médico, industrial, político, advogado artista NT escritor, músico, cineasta, escultor, pintor personagem SN personagem histórica NT monarca, navegador rei USE monarca pintor SN inclui artistas autores de pinturas, desenhos, gravuras, aguarelas MAKES pintura, desenho, aguarela poema BY poeta texto BY escritor NT lenda, oração, poema, conto popular, biografia, monografia, livro, artigo estátua BY escultor revista UF jornal foto UF fotografia BT imagem imagem NT desenho, aguarela, foto, pintura línguas INST português, galego, chinês NT gramática português UF língua portuguesa software SN programas, script, módulos, ... NT script, module, CGI associação SN incluindo núcleos, clubes, etc UF grupo, núcleo associações e instituições NT associação, museu, biblioteca, parque, parque UF parque natural, area-protegida INST Parque Nacional da Peneda-Gerês ,Parque Natural do Alvão INST Area Protegida do Litoral de Esposende período barroco NT sec. XVIII ciência NT matemática, físico-química, botânica, zoologia, ecologia, astronomia, geologia, telecomunicões astronomia ecologia NT parque literatura BY escritor curso BT didática físico-química matemática BY matemático monografia SN texto ou conjunto de textos acerca de uma localidade casa NT café NT igreja NT convento café HAS esplanada monumento NT igreja , paço, convento, ruína igreja HAS igreja_po, rosácea janela NT janela de rótula elemento urbano NT jardim, praça, rua, monumento, pormenor arq, casa elemento natural/rural NT flor, árvore, paisagem, moinhos pormenor arq NT janela, estátua, porta, varanda, clara-bóia local NT povoação, região, cidade, vila, freguesia, concelho, distrito, província NT país cidade INST Viana do Castelo, Bragança, Porto, Braga, Vila Real, Miranda do Douro INST Guimarães, Esposende, Póvoa do Varzim, Chaves, Famalicão INST Fafe, Mirandela, Valpaços vila INST Taipas, Ponde do Lima, Ponte da Barca, Lindoso, Vila Flor INST Vila do Conde, Monção Minho IOF província HAS Viana do Castelo, Braga, Guimarães, Esposende, Fafe HAS Taipas, Ponde do Lima, Ponte da Barca, Lindoso HAS Caminha, Melgaço, Valença, Famalicão, Monção Trás-os-Montes IOF província HAS Bragança, Vila Real, Miranda do Douro, Chaves, Valpaços HAS Vila Flor, Mirandela, Mogadouro, Macedo de Cavaleiros, Moncorvo Douro Litoral IOF província HAS Porto Portugal IOF país HAS Trás-os-Montes, Douro Litoral, Minho Cabo Verde IOF país RT mornas etnografia NT seda, olaria, linho, rendas, alfaias agrícolas, etno-musicologia têxtil NT linho, seda, rendas linho URL http://natura.di.uminho.pt/~jj/linho/ história NT numismática, arqueologia HAS personagem arqueologia NT arqueosítio visita guiada SN Exemplo: passeios virtuais, etc clérigo UF padre BT pessoa Biblio-Thesaurus-0.43/examples/thesaurus2biling000755 000765 000024 00000003420 11764424504 021752 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s eval 'exec /usr/bin/perl -s -S $0 ${1+"$@"}' if 0; # not running under some shell our ($all); my $langa = shift; my ($l1,$l2) = ($langa =~ m{(.*)\.\.(.*)}); die("Usage: $0 l1..l2 thesaurus\n") unless $l1 && $l2; $l1=uc($l1); $l2=uc($l2); my $thesaurus= shift or die("Usage: $0 l1..l2 thesaurus\n") ; die("thesaurus $thesaurus not found\n") unless -f $thesaurus; use Biblio::Thesaurus; my $obj = thesaurusLoad($thesaurus); ### open (G,">_bilingue") or die("cant create bilingue"); print "%encoding ",$obj->{encoding},"\n\n" if defined $obj->{encoding}; binmode(STDOUT, $obj->{encoding}.":") if defined $obj->{encoding}; $obj->downtr( { $l1 => sub { $p1 = $terms[0] ; }, $l2 => sub { $p2 = $terms[0] ; }, IOF => sub { $cl = join(",",@terms); }, # BT => sub { $bt = join(",",@terms); }, -eachTerm => sub { if(uc($obj->baselang()) eq $l1) {$p1 = $term; } if(uc($obj->baselang()) eq $l2) {$p2 = $term; } # $cl ||=$bt; if ($p1 || $p2){ $p1 = "[$l1-$l2 $p2]" unless $p1; $p2 = "[$l2-$l1 $p1]" unless $p2; if ($all){ print "$p1 : $p2 : $cl\n"} else { print "$p1 : $p2 : $cl\n" unless $p1 =~ /^[\[]/ || $p2 =~ /^[\[]/ ; } $p1 = $p2 = $bt = $cl = ""; } else {""} } }); __END__ =head1 NAME thesaurus2biling - converts a ISO-... thesaurus in a bilingual dictionary =head1 SYNOPSIS thesaurus2biling [-all] lang1..lang2 thesaurus > bilingDict =head1 DESCRIPTION By default the terms straring with "[" are skiped. Use "-all" option to obtain all the terms. =head2 Output Format :: :: =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/examples/thesaurus2graphviz000644 000765 000024 00000003731 11545403325 022335 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s use strict; use warnings; our ($shape,$h,$r,$of,$t); my %par=(); my %nodepar=(); my %grel=(NT=>1); ## graph relations $par{rankdir}=1 if $h; $nodepar{shape}=$shape if $shape; if($r){for(split(/,/,$r)){$grel{$_}=1}} $of||="png"; use Biblio::Thesaurus; use GraphViz; my $filename = shift; my $the = thesaurusLoad($filename); my $TERMS; my $NTS; $the->downtr( { -default => sub { ## term rel @terms return if $rel eq "SN"; $TERMS->{$term}++; if (defined $grel{$rel}) { for (@terms) { my $ttt = $_ || "??"; next if (defined $t and $t ne $term or $t ne $ttt); push @$NTS, [$term, $ttt]; } } } }); my $g = GraphViz->new(%par); my $i = 0; for (keys %$TERMS) { $i++; $g->add_node("ID$i", label => $_,%nodepar); $TERMS->{$_} = "ID$i"; } for my $ramo (@$NTS) { $g->add_edge($TERMS->{$ramo->[0]}, $TERMS->{$ramo->[1]}); } print $g->as_png if $of eq "png"; print $g->as_svg if $of eq "svg"; print $g->as_text if $of eq "text"; print $g->as_ps if $of eq "ps"; print $g->as_imap if $of eq "imap"; __END__ =head1 NAME thesaurus2graphviz - draw a biblio::thesaurus graph =head1 SYNOPSIS thesaurus2graphviz [-shape=... -h] file.the > _.png =head1 DESCRIPTION =head2 Option -h Horizontal mode (-rankdir=1 from graphviz) -shape=circle ('record', 'plaintext', 'ellipse', 'circle', 'egg', 'triangle', 'box', 'diamond', 'trapezium', 'parallelogram', 'house', 'hexagon', 'octagon') -r=INST,PART graph relation = {NT, INST, PART } default graph relation = {NT} -of=svg output format (def:png) -t=term show only "term" and his neighbours =head1 AUTHOR Albero Simoes J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/examples/thesaurusSC000755 000765 000024 00000002740 11764424461 020737 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s our ($ori,$ablt,$rblt,$protcom) ; my ($bl,$langpat,%l); $bl = $ablt if (defined $ablt) && ($ablt =~ /[A-Za-z]+/); $bl = $rblt if (defined $rblt) && ($rblt =~ /[A-Za-z]+/); $/=''; my $s; while(<>){ chomp; if(/^\%/){ if (/\%baselang(?:uage)?\s+(\S+)/) { $bl ||= $1; } while( m/\%lang(?:uage)?\s+(\S+.*)/g) { my $ls = $1; for ($ls =~ m/(\S+)/g){ $l{$_} = 1}; $langpat=join('|',keys %l)} } elsif(/^[#]/) { } elsif(/^(.*)/) { if($protcom){ while( s/^(.*?)\s*,\s*/$1 _and / ){} if($langpat){while( s/(\n(?:$langpat))(.*?)\s*,\s*/$1$2 _and / ){} } } $_ .= "\nFROM $ori" if $ori; $_ = "$bl $_" if $ablt; $_ =~ s/^$bl\s+//i if $rblt; } print "$_\n\n"; } __END__ =head1 NAME thesaurusSC - Make thesaurus changes (simple changes) =head1 SYNOPSIS thesaurusSC -ori=Natura file.the (adds "FROM Natura" to each entry) thesaurusSC -ablt[=PT] file.the (adds "baselang tag" to each entry) thesaurusSC -rblt[=PT] file.the (removes "baselang tag" from each entry) thesaurusSC -protcom file.the (protect commas "," --> "_and ") =head1 DESCRIPTION Make simple changes in Thesaurus files. 1 add origin 2 add baselang tag 3 removes baselang tag 4 protcom protect commas in term and language 5 cat (not yet implemented) =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/examples/thesaurusV000644 000765 000024 00000002100 11764424414 020620 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s our ($ori,$ablt,$rblt) ; my ($bl); $bl = $ablt if (defined $ablt) && ($ablt =~ /[A-Za-z]+/); $bl = $rblt if (defined $rblt) && ($rblt =~ /[A-Za-z]+/); $/=''; my $s; while(<>){ chomp; if (/^\%baselang(?:uage)?\s+(\S+)/) { $bl ||= $1; while(m{\%(\S+)}g){ $direct{$1}++} } elsif(/^\%(\S+)/) { while(m{\%(\S+)}g){ $direct{$1}++} } elsif(/^#/) { } elsif(/^(\S.*)/) { my $t = $1; my $r = $'; while( m{\n(\S+)(?:[ \t]+)((?:.|\n[ \t])*)}g ){ my $def=$2; $rel{$1}++; print "Error: $1($t) is empty\n" unless $def =~ /\w/; }} else { print "Error: {$_}\n"} } for (sort keys %direct){ print "\%$_\t= $direct{$_}\n"; } for (sort keys %rel){ print "$_\t= $rel{$_}\n"; } __END__ =head1 NAME thesaurusV - Make thesaurus statistics =head1 SYNOPSIS thesaurusV file.the =head1 DESCRIPTION Calculates some statistics and simple validations Useful for debugging thesaurus... =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/docs/internals.pod000644 000765 000024 00000005327 10730327653 020356 0ustar00ambsstaff000000 000000 # -*- cperl -*- =head1 NAME Biblio::Thesaurus internals =head1 ABSTRACT This POD documents the internals of the thesaurus object. =head1 DESCRIPTION The object itself is a reference to an hash table. We will call properties to each key. This way, the C property is accessible using C<$obj->{name}>. Follows a section for each property accessible in the object. =head2 version The C property is the Biblio::Thesaurus version for the module which created the object. This property value is the same of the current module version unless the object was created using the C constructor. =head2 baselang This property holds the name of the default language (relation name) for the thesaurus. If it is not defined, then this property returns an interrogation mark (C). =head2 name When showing the thesaurus without a search term, the module shows the top for the thesaurus. This special term is stored in this attribute. Unless it is defined, the default top term is C<_top_>. =head2 inverses This property is a reference to an hash table. Keys are relation identifiers, and values are the respective inverse. Note that B that each key is too a value, and vice-versa. This means than when defining C is inverse of C it should be added to the hash: ( 'A' => 'B', 'B' => 'A' ) =head2 externals To define external relations (relations not to terms) it should be added to the C property one more entry. This property is a reference to an hash table where keys are the external relation identifiers and values are ignored (use the natural 1 value). =head2 languages As in the C property, this one stores identifiers for language pseudo-relations. This property is a reference to an hash table where keys are the language identifiers and values are ignored (use the natural 1 value). =head2 descriptions This property holds a reference to an hash table which maps relation identifiers to relation descriptions. On multi-language thesaurus, descriptions in the base language are defined normally: $obj->{descriptions}{RT} = "related term"; Descriptions for different languages use a relation identifier, a space and a language identifier as in: $obj->{descriptions}{'RT PT'} = "termo relacionado"; =head2 defined This property holds a reference to a big hash table. It maps normalized terms on their non normalized version (how they appear in the ISO file). On multi-language thesaurus, it maps non-base language terms to the normalized base language terms. =head2 thesaurus This is not really the name of the property. For each language identifier the object contains a property with that name that holds the thesaurus in that specific language as it follows: =cut Biblio-Thesaurus-0.43/bin/codetax2thesaurus000644 000765 000024 00000003157 11671663061 021073 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s use strict; our($dn8_2, $dx8_1, $sufixo_p, $tab, $the); $dn8_2 //= 0; $dx8_1 //= 1; $sufixo_p //= 1; $the //= 1; my $A=qr{[a-zA-Z_]}; my $N=qr{[0-9]}; my $X=qr{[1-9]}; my $P=qr{[0-9]}; my $f1=qr{\s*$}; my $f2=qr{\b}; my $f3=qr{[ \t\-:]|$}; my $termo; while(<>){ #if($termo =~ m/^($A)($A)$f2/){ add("NT", $1) } #if($termo =~ m/^($A$A)($N$N)$f2/){ add("NT", $1) } if(/(.*)\s/){$termo=$1} else {$termo=""} print; if ($dn8_2){ if($termo =~ m/^($N{6})($N{2})$f2/){ add($termo,"BT","${1}");next } if($termo =~ m/^($N{4})($N{2})$f2/){ add($termo,"BT","${1}");next } if($termo =~ m/^($N{2})($N{2})$f2/){ add($termo,"BT","${1}");next } } if ($dx8_1){ if($termo =~ m/^($N{1})$X(0{6})$f2/){ add($termo,"BT","${1}0$2");next } if($termo =~ m/^($N{2})$X(0{5})$f2/){ add($termo,"BT","${1}0$2");next } if($termo =~ m/^($N{3})$X(0{4})$f2/){ add($termo,"BT","${1}0$2");next } if($termo =~ m/^($N{4})$X(0{3})$f2/){ add($termo,"BT","${1}0$2");next } if($termo =~ m/^($N{5})$X(0{2})$f2/){ add($termo,"BT","${1}0$2");next } if($termo =~ m/^($N{6})$X(0{1})$f2/){ add($termo,"BT","${1}0$2");next } if($termo =~ m/^($N{7})$X$f2/) { add($termo,"BT","${1}0" );next } } if ($sufixo_p){ if($termo =~ m!^(\S+)[:._/](\S+)$f2!){ add($termo,"BT",$1);next;} } } sub add{my ($a,$b,$c)=@_; print "$b $c\n" } __END__ \d\d$/GBT/00#\d\d\00$/GBT/0000#\d\d0000$/GBT/000000 03111400-6 XX000000-Y XXX00000-Y XXXX0000-Y XXXXX000-Y XXXXXX00-Y XXXXXXX0-Y A|A AA|XX NN|X|X|X|X|X|X-Y X = digit or 0 / N = digit / P = parity digit NN|X|X|X|X|X|X-Y Z = digit or nothing / Biblio-Thesaurus-0.43/bin/tag2thesaurus000755 000765 000024 00000004140 11302332063 020175 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s eval 'exec /usr/bin/perl -w -s -S $0 ${1+"$@"}' if 0; # not running under some shell our($rs,$fs,$fss); my $bl = shift or die("usage: $0 [options] Baselang File option -fs=.. -rs=.. -fss=..\n"); my $n=1; $rs ||= ''; $rs =~ s/\\n/\n/g; $rs =~ s/\\n/\n/g; $/= $rs; $fs ||= qr{\s*\n}; $fss ||= qr{\s+}; print "%baselang $bl\n\n"; while(<>){ next unless /\w/; if(/^\%/){ print ; next } chomp; my %r=(); my $term=undef; # for $a (split(/\s*\n/,$_)){ for $a (split(/$fs\s*/,$_)){ if ($a =~ /^($bl)$fss\s*(.*)/){ if ($term and $term ne $2){warn("2 baselang terms ($term,$2)\n"); push(@{$r{"SYN-$bl"}}, $2) } else {$term=$2;} } elsif($a =~ /^(\S+)$fss\s*(.+)/){ push(@{$r{$1}}, $2) } else { warn("???: $a\n") } } $term ||="undefined term ".$n++; print "\n$term\n"; for $a (keys %r){ for ( @{$r{$a}} ){ print("$a $_\n") } } } __END__ =head1 NAME tag2thesaurus - transform a tagdictionary into a thesaurus =head1 SYNOPSIS tag2thesaurus baselang tagdictionary =head1 DESCRIPTION tag2thesaurus converts tag-format in thesaurus-format Input file should follow the following tag-format lang1 t11 lang2 t12 rel ... lang1 t21 lang2 t22 rel ... C output looks like: %baselang lang2 t12 lang1 t11 rel ... t22 lang1 t21 rel ... =head2 Options -fs='::' field separator (def \n) -rs='\n' record separatos (def "\n\n") -fss='=' fiels separator2 (def \s+) Example: if the file has the following format L1=v11 :: L2= v21 :: Ln= vn1 L1=v12 :: L2= v22 :: Ln= vn2 the command tag2thesaurus -rs='\n' -fs='::' -fss='=' L2 file would produce thesaurus : %baselang L2 v21 L1 v11 Ln vn1 v22 L1 v12 Ln vn2 =head2 Errors and Warnings Each entry must have a term in the baselanguage (otherwise, it is created one named "undef term 1"). Each entry should have just one term in the baselanguage (the first one will be taken as the term, the other became C ). =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/bin/tageditor2thesaurus000755 000765 000024 00000002122 10730327653 021417 0ustar00ambsstaff000000 000000 #!/usr/bin/perl #undef $/; #$/=''; my %ent = # em utf8... qw( aacute á Aacute à acirc â Acirc  agrave à Agrave À aring Ã¥ Aring Ã… atilde ã Atilde à auml ä Auml Ä aelig æ AElig Æ ccedil ç Ccedil Ç eacute é Eacute É ecirc ê Ecirc Ê egrave è Egrave È euml ë Euml Ë iacute í Iacute à icirc î Icirc ÃŽ igrave ì Igrave ÃŒ iuml ï Iuml à ntilde ñ Ntilde Ñ oacute ó Oacute Ó ocirc ô Ocirc Ô ograve ò Ograve Ã’ oslash ø Oslash Ø otilde õ Otilde Õ ouml ö Ouml Ö szlig ß uacute ú Uacute Ú ucirc û Ucirc Û ugrave ù Ugrave Ù uuml ü Uuml Ü yacute ý Yacute à yuml ÿ deg ° ordm º ordf ª copy © quot ' euro € laquo « raquo » amp & lt < gt > nbsp _SPACE_ ); while(<>){ next if m{//; print html2u($_); } sub html2u{ my $f = shift; for ($f){ s#\&(\w+);#$ent{$1} || $& #ge; s#\&\#(\d+);# pack("U",$1) #ge; s#\&\#x([\dA-Fa-f]+);# pack("U",hex($1)) #ge; s#_SPACE_# #ig; } $f } Biblio-Thesaurus-0.43/bin/tax2thesaurus000755 000765 000024 00000003315 11671663061 020237 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s use strict; my @tab=(-1); my @ant=("_top"); my $ind; my $inda=0; binmode(STDOUT,":utf8"); @ARGV = ("-") unless @ARGV; for my $file (@ARGV){ open(F,$file) or die("cant open $file\n"); print "%enc utf8\n\n"; while(){ chomp; if(/%enc(oding)?\s+(latin1)/){next} if(/%enc(oding)?\s+(utf8)/i) {binmode(F,":utf8");next} my $termo; next unless /\w/; if(/^([ .-]*)(\S.*)/){ $termo = $2; $ind = length($1); ## print STDERR "Deb: $_($termo/$ind)\n"; if ($ind <= $inda) { while(@tab and $ind <= $tab[-1]){ pop(@tab); pop(@ant);} } push (@tab,$ind); push (@ant,$termo); print "\n$termo\nBT ",$ant[-2] ||"?","\n#lev ", @ant -1,"\n"; $inda = $ind; } } close F; } __END__ =head1 NAME tax2thesaurus - indented taxonomy to thesaurus =head1 SYNOPSIS tax2thesaurus file.tax > file.the =head1 DESCRIPTION A top term C<_top> is added to the taxonomy. =head2 example of the taxonomy format science physics dynamics math algebra trignometry literature =head2 Correspondent output science BT _top #lev 1 physics BT science #lev 2 dynamics BT physics #lev 3 math BT science #lev 2 algebra BT math #lev 3 trignometry BT math #lev 3 literature BT _top #lev 1 =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut __END__ Geografia Política América América Central Belize Costa Rica El Salvador Guatemala Honduras Nicarágua Panamá América Do Norte Canadá Alberta Biblio-Thesaurus-0.43/bin/thesaurus2any000755 000765 000024 00000005160 11302332063 020214 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s =head1 NAME thesaurus2any - translate thesaurus like notation to xml catalogue format =head1 SYNOPSYS thesaurus2any -cat file > file.xml thesaurus2any -thes file > file.the thesaurus2any -tex file > file.tex =head1 DESCRIPTION translates a thesaurus file into XML catalogue format, or thesaurus format after completion, or to latex format. =head2 OPTIONS -cat XML catalogue format -thes thesaurus format -tex LaTeX format -multi =cut # use lib qw(/home/jj/lib/perl5/); use Biblio::Thesaurus; use vars qw($thes $cat $tex $multi); if ($multi) { $the = thesaurusMultiLoad(@ARGV); } else { my $thesaurus = shift || die ("use: thesaurus2any filename\n"); $the = thesaurusLoadM($thesaurus); } #$the->addInverse('RT' , "RT"); #$the->addInverse('POF' => "HAS"); #$the->addInverse('about' => "citedIn"); #$the->addInverse('needs' => "usedIn"); #$the->addInverse('INST' => "IOF"); #$the->addInverse('makes' => 'by'); sub t2cat{ my @rel= qw( about IOF needs usedin POF NT BT RT INST HAS makes by ); my $corres={ about => [ "rel type='about'","/rel"], POF => [ "rel type='POF'","/rel"], makes => [ "rel type='RT'","/rel"], by => [ "rel type='BT'","/rel"], SN => [ "description","/description"], }; for (@rel){$corres->{$_}=["rel type='$_'","/rel"];} $mydetails = { '-eachTerm' => sub{"\n $term\n$_" . ($nott? "": " termo thesaurus\n"). ($nott? "": " $_c\n") . "\n\n"}, '-end' => sub{"\n\n$_\n"} , }; print "\n", $the->toXml($corres,$mydetails); } $names = { # 'NT' => "termo específico", # 'BT' => "termo genérico", # 'RT' => "termo associado", # 'POF' => "parte de", # 'HAS' => "partes", # 'about' => "acerca de", # 'needs' => "necessita de", # 'usedin' => "usado em", # 'IOF' => "instância de", # 'INST' => "instâncias", # 'makes' => 'obras', # 'by' => 'autor', # 'USE' => "ver", # 'USES' => "sinónimo", # 'SN' => "Descrição", }; $auxfile="/tmp/$$.out"; sub t2the{ $the->save($auxfile) ; print `cat $auxfile` ; unlink($auxfile); } sub t2tex{ print $the->toTex( {map {$_ => ["\\\\\\emph{$names->{$_}} -- ",""]} keys %$names }, # { -order => [qw{SN NT BT RT IOF HAS}] } ) ; } if ($tex){t2tex();} elsif($cat){t2cat();} else {t2the();} Biblio-Thesaurus-0.43/bin/thesaurus2htmls000755 000765 000024 00000010043 11545403325 020561 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -s our ($dir,$nostyle); $dir ||= "."; my $thesaurus= shift or die("Usage: $0 [-dir=...] thesaurus [lang*]\n"); mkdir($dir) unless -d $dir; my $obj; use Biblio::Thesaurus; use CGI qw(:all :nodebug); for my $lang ("",@ARGV) { if($lang){ system("thesaurusTranslate $thesaurus - $lang > _$thesaurus$lang"); $thename="_$thesaurus$lang";} else { $thename=$thesaurus; } my $la="!"; $obj = thesaurusLoad($thename); open (G,">$dir/${lang}0_lista_de_termos.html") or die("cant create index"); binmode(G,":utf8"); binmode(STDERR,":utf8"); $obj->baselang('_') if ($obj->{baselang} eq "?"); print G $obj->downtr( {-default => sub { my $cla=(exists $obj->{languages}{$rel})?"lingua":"rel"; if (@terms > 1) { li({-class => $cla}, strong($obj->getDescription($rel)),"\t". ul(li([map{ a({href=>t2f($_,$cla,$rel)},$_)."\n"} sort @terms])))."\n"; } else { li({-class => $cla, '-lang' => $rel}, strong($obj->getDescription($rel)),"\t". join(", ", map{ a({href=>t2f($_,$cla,$rel)},$_)."\n"} sort @terms))."\n"; } }, -order => ["IOF", "PT","EN","FR",'SP','ES',"DE","IT","DA","NL","SV","FI" ,"EL","HU","HE" ,"BT","NT","RT","MT","UF","USE","SN"], -end => sub { ambsheader($obj->{title},$obj->{baselang}). ($obj->{title}? h2($obj->{title}):""). ($obj->{author}? h3($obj->{author}):""). "\n
    \n $_ \n
\n
". ambsfooter(); }, -eachTerm => sub { mkdir("$dir/" . uc($obj->{baselang})) unless -d ("$dir/" . uc($obj->{baselang})); my $tt=t2f($term,"lingua",$obj->{baselang},1); open(F,">$dir/$tt") or die ("cant create file $dir/$tt\n"); binmode(F,":utf8"); print F ambsheader($term,$obj->{baselang}); print F h3($term).ul($_)."\n"; print F ambsfooter(); close F; print STDERR "$term\n"; if(lett($term) ne $la){ $la = lett($term); return "\n".li(a({href=>$tt},red($term)));} else{ return "\n".li(a({href=>$tt},$term));} }, SN => sub{ li({-class=>"text"},strong("sn"),join("",@terms)) }, }); close G; } sub t2f{ my ($a,$c,$r,$f)=@_; my $dir= uc( ($c eq "lingua")? $r : $obj->{baselang}); for ($a){ s/[ºª\[\] \(\)\/\?\":]/_/g ; tr [A-ZÁÉÍÓÚÂÊÔÇÃÕÑÄËÏÖÜ] [a-záéíóúâêôçãõñäëïöü] ; s/([\x{80}-\x{ffff}])/sprintf('=%x',ord($1))/ge ; # tr [A-ZÁÉÍÓÚÂÊÔÇÃÕÑÄËÏÖÜ] # [a-zaeiouaeocaonaeiou] ; # tr/áéíóúàèìòùâêîôûäëïöüãõñç/AEIOUAEIOUAEIOUAEIOUAONC/; } ($f)?"$dir/$a.html":"../$dir/$a.html" } sub lett{ my $p=shift; $p =~ s/\W//g ; $p =~ tr [A-ZÁÉÍÓÚÂÊÔÇÃÕÑÄËÏÖÜ] [a-zaeiouaeocaonaeiou] ; $p =~ tr [áéíóúâêôçãõñäëïöü] [aeiouaeocaonaeiou] ; substr $p,0,1; } sub ambsheader{ my ($t,$l)=@_; return qq{ $t } . ($nostyle ?"" : qq{\n}). qq{ }; } sub ambsfooter{ return qq{ } } sub red{ font({color=>"red"},@_)} __END__ NumExp __END__ =head1 NAME thesaurus2htmls - generates a HTML site (one file for each term) =head1 SYNOPSIS thesaurus2htmls [-dir=D] thesaurus [lang*] =head1 DESCRIPTION For each term in each language passed as argument, creates a linked HTML page. =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). =cut Biblio-Thesaurus-0.43/bin/thesaurus2tageditor000644 000765 000024 00000000405 10730327653 021416 0ustar00ambsstaff000000 000000 #!/usr/bin/perl #undef $/; $/=''; print "
\n";
while(<>){
 chomp;
 if(/^(.*)\n/){
    print "\nPt: $1\n";
    my $a = "$'\n";
    $a =~ s/(.+)\n/\n/g;
    print "$a\n\n"
 }
}
print "
\n"; Biblio-Thesaurus-0.43/bin/thesaurus2tex000644 000765 000024 00000003303 11221167706 020231 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s our ($c); use locale; use strict; use Biblio::Thesaurus; my $the = shift; my $dest = shift || "thesaurus"; my $obj = thesaurusLoad($the); my $tit = $obj->{title} || $dest; my $aut = $obj->{author} || ""; my $desc = $obj->{desc} || ""; open(F,">$dest.tex") or die; binmode(F, ":utf8"); print F qq{ \\documentclass[twoside,portuges]{book} \\RequirePackage[a4paper,top=3cm,left=2cm,right=2cm,bottom=1.5cm,nofoot]{geometry} \\parindent 0pt %\\parskip 3pt \\usepackage{babel} %\\usepackage[latin1]{inputenc} \\usepackage[utf8]{inputenc} \\usepackage{t1enc} \\usepackage{aeguill} \\usepackage{dict} \\begin{document} \\title{$tit} \\author{$aut} \\date{\\today} \\maketitle $desc \\newpage \\twocolumn }; print F $obj->toTex({},{ ## -default => sub { "\n$rel \t".join("\n\t",@terms)}, -eachTerm => sub { $term = n($term); $_ = n($_); if(/\w/){"\n\\term{$term}{ \\begin{description} $_ \\end{description}}\n"} else {"\n\\term{$term}{}\n"}}, -end => sub { "\\begin{dictionary}\n$_\n\\end{dictionary}\n"}, ## -order => ["BT","NT","RT"], }); print F q{ \end{document} }; close(F); sub n{ my $a = shift; $a =~ s/^([#_])/\\$1/g; $a =~ s/^([\^])/*$1/g; $a =~ s/([^\\])([#_])/$1\\$2/g; $a =~ s/([^\\])([\^])/$1*/g; $a; } __END__ =head1 NAME thesaurus2tex - ISO-Thesaurus completion and LaTeX translator =head1 SYNOPSIS thesaurus2tex file.the > file.tex -c to include separators when first letters change (NOT YET) pdflatex file dicttex file pdflatex file ## to make file.pdf =head1 DESCRIPTION =head2 EXPORT =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO dicttex (http://natura.di.uminho.pt) perl(1). =cut Biblio-Thesaurus-0.43/bin/thesaurus2TMX000755 000765 000024 00000001142 10730327653 020106 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s use strict; use Biblio::Thesaurus; use XML::TMX::Writer; our ($comment); my $thesaurus = shift or die_usage(); my $the = thesaurusLoad($thesaurus); my $tmx = new XML::TMX::Writer(); $tmx->start_tmx(); my $blang = $the->baselang(); my $tu; $the->downtr( { -default => sub { $tu->{$rel} = join(", ",@terms) if $the->{languages}{$rel}; "" }, -eachTerm => sub { if ($term ne "_top_") { $tu->{$blang} = $term; $tmx->add_tu(%$tu); } $tu = {}; }, }); $tmx->end_tmx(); sub die_usage { die "Usage: $0 \n"; } Biblio-Thesaurus-0.43/bin/thesaurusTranslate000755 000765 000024 00000005361 10730327653 021320 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -s use strict; use Biblio::Thesaurus; our ($comment,$dic); my %AuxDic=(); if ($dic){ open(F,"$dic") or die("cant open auxiliar dictionary\n"); while(){ chomp; if (/^%\s*enc(oding)?\s+(\S+)/) { binmode(F,"$2:"); next } if(/(.*?)\s*:\s*([^:\n]*)/){$AuxDic{lc($1)}=$2 } } close F; } my $thesaurus = shift or die_usage(); my $langorig = shift or die_usage(); my $langdest = shift or die_usage(); my $the = thesaurusLoad($thesaurus); binmode(STDOUT,"$the->{encoding}:") if defined $the->{encoding}; $langorig = $the->{baselang} if $langorig eq "-"; my $m=$the->meta2str(); $m =~ s/(\%baselang(?:uage)?\s+)([-\w:]+)/$1$langdest/; print $m; our $termBody = ""; $the->downtr( { -order => [qw/PT FR SP ES EN DE BT NT RT MT UF USE SN/], -default => sub { if ($the->{languages}{$rel}) { ### Is a language if ($rel eq "$langdest") { # $termBody .= "$langorig $term\n" "" } else { $termBody .= "$rel ".join(", ",@terms)."\n" } } else { if ($the->{externals}{$rel}) { ### Is external $termBody .= join("", map { "$rel $_\n" } @terms); # $termBody .= "$rel $term\n" } else { if ($comment) { $termBody .= join("", map { my $trans = $the->_translateTerm($_,$langdest,\%AuxDic); my $t; if ($t = missing_translation($trans)) { "# $rel $t\n"; } else { "$rel $trans\n"; } } @terms); } else { $termBody .= join("", map {"$rel ".$the->_translateTerm($_,$langdest,\%AuxDic)."\n"} @terms); } } } }, -eachTerm => sub { my $tterm = $the->_translateTerm($term,$langdest,\%AuxDic); if ($comment) { my $t; if ($t = missing_translation($tterm)) { print "\n# $t\n$langorig $term\n$termBody"; } else { print "\n$tterm\n$langorig $term\n$termBody"; } } else { print "\n$tterm\n$langorig $term\n$termBody"; } $termBody = ""; }, }); sub die_usage { die "Usage: $0 [options] Options: -comments # missing translations become comments -dic=auxiliardic # use an auxiliar dicionary\n"; } sub missing_translation { my $t = shift; if ($t =~ m!^\[[A-Z]+-[A-Z]+:(.*)\]$!) { return $1 } else { return undef; } } __END__ =head1 NAME thesaurusTranslate - Change baselanguage of a thesaurus ISO... =head1 SYNOPSIS thesaurusTranslate [ops] thesaurus sourceLang targetLang > f.the ops: -comments # missing translations become comments -dic=auxiliardic # use an auxiliar dicionary =head1 DESCRIPTION =head2 EXPORT =head1 AUTHOR J.Joao Almeida, jj@di.uminho.pt =head1 SEE ALSO perl(1). Biblio::Thesaurus =cut