pax_global_header00006660000000000000000000000064122001514000014473gustar00rootroot0000000000000052 comment=02b38dca58a1ecd54e67c5d951b22b60815caa72 ocamlrss-2.2.2/000077500000000000000000000000001220015140000133215ustar00rootroot00000000000000ocamlrss-2.2.2/.depend000066400000000000000000000003421220015140000145600ustar00rootroot00000000000000rss.cmo : rss_types.cmo rss_io.cmo rss.cmi rss.cmx : rss_types.cmx rss_io.cmx rss.cmi rss_io.cmo : rss_types.cmo rss_io.cmx : rss_types.cmx rss_types.cmo : rss_types.cmx : rsstest.cmo : rss.cmi rsstest.cmx : rss.cmx rss.cmi : ocamlrss-2.2.2/Changes000066400000000000000000000020441220015140000146140ustar00rootroot00000000000000Release 2.2.2: - fix: remove the beginning and ending spaces in contents of xml nodes, but do not collapse spaces inside PCData. (by C. Troestler) Release 2.2.1: - fix: do not set default namespace in final rss - fix: rdf parsing - fix: Rss.sort_items_by_date sorts with old items last, as documented Release 2.2.0: - add: Rss.compare_item - fix: provide polymorphic functions on channel_t and on item_t - add: Rss.channel_t_of_xmls - add: Rss.xml_of_source - fix #5: incorrect sort Release 2.1.0: - handle namespaces (the list is stored in the channel structure) - allow the library user to read addition information from and prefixed subnodes, as the RSS 2.0 specification indicates this is the way to extend RSS information - handle cloud, skipHours, skipDays, picsRating in Channel - use Neturl.url instead of sting - use Netdate instead of Rss_date, which was a copy of Netdate - default encoding is now UTF-8 - new parameter "indent" for printing - new "opts" parameter when reading - reading returns error list besides channel ocamlrss-2.2.2/LICENSE000066400000000000000000000167431220015140000143410ustar00rootroot00000000000000 GNU LESSER GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. This version of the GNU Lesser General Public License incorporates the terms and conditions of version 3 of the GNU General Public License, supplemented by the additional permissions listed below. 0. Additional Definitions. As used herein, "this License" refers to version 3 of the GNU Lesser General Public License, and the "GNU GPL" refers to version 3 of the GNU General Public License. "The Library" refers to a covered work governed by this License, other than an Application or a Combined Work as defined below. An "Application" is any work that makes use of an interface provided by the Library, but which is not otherwise based on the Library. Defining a subclass of a class defined by the Library is deemed a mode of using an interface provided by the Library. A "Combined Work" is a work produced by combining or linking an Application with the Library. The particular version of the Library with which the Combined Work was made is also called the "Linked Version". The "Minimal Corresponding Source" for a Combined Work means the Corresponding Source for the Combined Work, excluding any source code for portions of the Combined Work that, considered in isolation, are based on the Application, and not on the Linked Version. The "Corresponding Application Code" for a Combined Work means the object code and/or source code for the Application, including any data and utility programs needed for reproducing the Combined Work from the Application, but excluding the System Libraries of the Combined Work. 1. Exception to Section 3 of the GNU GPL. You may convey a covered work under sections 3 and 4 of this License without being bound by section 3 of the GNU GPL. 2. Conveying Modified Versions. If you modify a copy of the Library, and, in your modifications, a facility refers to a function or data to be supplied by an Application that uses the facility (other than as an argument passed when the facility is invoked), then you may convey a copy of the modified version: a) under this License, provided that you make a good faith effort to ensure that, in the event an Application does not supply the function or data, the facility still operates, and performs whatever part of its purpose remains meaningful, or b) under the GNU GPL, with none of the additional permissions of this License applicable to that copy. 3. Object Code Incorporating Material from Library Header Files. The object code form of an Application may incorporate material from a header file that is part of the Library. You may convey such object code under terms of your choice, provided that, if the incorporated material is not limited to numerical parameters, data structure layouts and accessors, or small macros, inline functions and templates (ten or fewer lines in length), you do both of the following: a) Give prominent notice with each copy of the object code that the Library is used in it and that the Library and its use are covered by this License. b) Accompany the object code with a copy of the GNU GPL and this license document. 4. Combined Works. You may convey a Combined Work under terms of your choice that, taken together, effectively do not restrict modification of the portions of the Library contained in the Combined Work and reverse engineering for debugging such modifications, if you also do each of the following: a) Give prominent notice with each copy of the Combined Work that the Library is used in it and that the Library and its use are covered by this License. b) Accompany the Combined Work with a copy of the GNU GPL and this license document. c) For a Combined Work that displays copyright notices during execution, include the copyright notice for the Library among these notices, as well as a reference directing the user to the copies of the GNU GPL and this license document. d) Do one of the following: 0) Convey the Minimal Corresponding Source under the terms of this License, and the Corresponding Application Code in a form suitable for, and under terms that permit, the user to recombine or relink the Application with a modified version of the Linked Version to produce a modified Combined Work, in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source. 1) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (a) uses at run time a copy of the Library already present on the user's computer system, and (b) will operate properly with a modified version of the Library that is interface-compatible with the Linked Version. e) Provide Installation Information, but only if you would otherwise be required to provide such information under section 6 of the GNU GPL, and only to the extent that such information is necessary to install and execute a modified version of the Combined Work produced by recombining or relinking the Application with a modified version of the Linked Version. (If you use option 4d0, the Installation Information must accompany the Minimal Corresponding Source and Corresponding Application Code. If you use option 4d1, you must provide the Installation Information in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.) 5. Combined Libraries. You may place library facilities that are a work based on the Library side by side in a single library together with other library facilities that are not Applications and are not covered by this License, and convey such a combined library under terms of your choice, if you do both of the following: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities, conveyed under the terms of this License. b) Give prominent notice with the combined library that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 6. Revised Versions of the GNU Lesser General Public License. The Free Software Foundation may publish revised and/or new versions of the GNU Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library as you received it specifies that a certain numbered version of the GNU Lesser General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that published version or of any later version published by the Free Software Foundation. If the Library as you received it does not specify a version number of the GNU Lesser General Public License, you may choose any version of the GNU Lesser General Public License ever published by the Free Software Foundation. If the Library as you received it specifies that a proxy can decide whether future versions of the GNU Lesser General Public License shall apply, that proxy's public statement of acceptance of any version is permanent authorization for you to choose that version for the Library. ocamlrss-2.2.2/META000066400000000000000000000003331220015140000137710ustar00rootroot00000000000000version = "2.2.2" description = "Reading/writing RSS files with OCaml" requires = "xmlm,netstring" archive(byte) = "rss.cma" archive(native) = "rss.cmxa" archive(native,plugin) = "rss.cmxs" archive(toploop) = "rss.cma" ocamlrss-2.2.2/Makefile000066400000000000000000000100251220015140000147570ustar00rootroot00000000000000############################################################################### # OCamlrss # # # # Copyright (C) 2004-2013 Institut National de Recherche en Informatique # # et en Automatique. All rights reserved. # # # # This program is free software; you can redistribute it and/or modify # # it under the terms of the GNU Lesser General Public License version # # 3 as published by the Free Software Foundation. # # # # This program is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU Library General Public License for more details. # # # # You should have received a copy of the GNU Library General Public # # License along with this program; if not, write to the Free Software # # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA # # 02111-1307 USA # # # # Contact: Maxence.Guesdon@inria.fr # # # # # ############################################################################### # do not forget to update META file too VERSION=2.2.2 # do not forget to update META file too PACKAGES=xmlm,netstring OF_FLAGS=-package $(PACKAGES) OCAMLFIND=ocamlfind OCAML_COMPFLAGS= -annot OCAMLC=$(OCAMLFIND) ocamlc $(OF_FLAGS) $(OCAML_COMPFLAGS) OCAMLOPT=$(OCAMLFIND) ocamlopt $(OF_FLAGS) $(OCAML_COMFLAGS) OCAMLDOC=$(OCAMLFIND) ocamldoc $(OF_FLAGS) OCAMLDEP=ocamldep all: byte opt byte: rss.cma opt: rss.cmxa rss.cmxs CMOFILES= \ rss_types.cmo \ rss_io.cmo \ rss.cmo CMXFILES=$(CMOFILES:.cmo=.cmx) CMIFILES=$(CMOFILES:.cmo=.cmi) rss.cma: $(CMIFILES) $(CMOFILES) $(OCAMLC) -o $@ -a $(CMOFILES) rss.cmxa: $(CMIFILES) $(CMXFILES) $(OCAMLOPT) -o $@ -a $(CMXFILES) .PHONY: doc depend doc: all mkdir -p html $(OCAMLDOC) -d html -html rss.mli webdoc: doc mkdir -p ../ocamlrss-gh-pages/refdoc cp html/* ../ocamlrss-gh-pages/refdoc/ cp web/index.html web/style.css ../ocamlrss-gh-pages/ .depend depend: $(OCAMLDEP) rss*.ml rss*.mli > .depend rsstest: rss.cmxa rsstest.ml $(OCAMLOPT) -linkpkg -o $@ $(OCAML_COMPFLAGS) $^ test: rsstest @./rsstest test.rss > t.rss @./rsstest t.rss > t2.rss @((diff t.rss t2.rss && echo OK) || echo "t.rss and t2.rss differ") # installation : ################ install: $(OCAMLFIND) install rss META LICENSE $(wildcard rss.cmi rss.cma rss.cmxa rss.a rss.cmxs rss.mli rss.cmx) uninstall: ocamlfind remove rss # archive : ########### archive: git archive --prefix=ocamlrss-$(VERSION)/ HEAD | gzip > ../ocamlrss-gh-pages/ocamlrss-$(VERSION).tar.gz # Cleaning : ############ clean: -$(RM) *.cm* *.a *.annot *.o -$(RM) -r html -$(RM) rsstest t2.rss t.rss # headers : ########### HEADFILES=Makefile *.ml *.mli .PHONY: headers noheaders headers: headache -h header -c .headache_config $(HEADFILES) noheaders: headache -r -c .headache_config $(HEADFILES) # generic rules : ################# .SUFFIXES: .mli .ml .cmi .cmo .cmx .mll .mly .sch .html .mail %.cmi:%.mli $(OCAMLC) -c $(OCAML_COMPFLAGS) $< %.cmo:%.ml $(OCAMLC) -c $(OCAML_COMPFLAGS) $< %.cmi %.cmo:%.ml $(OCAMLC) -c $(OCAML_COMPFLAGS) $< %.cmx %.o:%.ml $(OCAMLOPT) -c $(OCAML_COMPFLAGS) $< %.cmxs: %.cmxa $(OCAMLOPT) -I . -shared -linkall -o $@ $< include .depend ocamlrss-2.2.2/README000066400000000000000000000001601220015140000141760ustar00rootroot00000000000000OCaml-RSS is an OCaml library to read and write RSS files. More information on http://zoggy.github.com/ocamlrss ocamlrss-2.2.2/rss.ml000066400000000000000000000200721220015140000144630ustar00rootroot00000000000000(******************************************************************************) (* OCamlrss *) (* *) (* Copyright (C) 2004-2013 Institut National de Recherche en Informatique *) (* et en Automatique. All rights reserved. *) (* *) (* This program is free software; you can redistribute it and/or modify *) (* it under the terms of the GNU Lesser General Public License version *) (* 3 as published by the Free Software Foundation. *) (* *) (* This program is distributed in the hope that it will be useful, *) (* but WITHOUT ANY WARRANTY; without even the implied warranty of *) (* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *) (* GNU Library General Public License for more details. *) (* *) (* You should have received a copy of the GNU Library General Public *) (* License along with this program; if not, write to the Free Software *) (* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA *) (* 02111-1307 USA *) (* *) (* Contact: Maxence.Guesdon@inria.fr *) (* *) (* *) (******************************************************************************) type date = Netdate.t let string_of_date ?(fmt="%d %b %Y") date = Netdate.format ~fmt date type email = string (** can be, for example: foo@bar.com (Mr Foo Bar) *) type pics_rating = string type skip_hours = int list (** 0 .. 23 *) type skip_days = int list (** 0 is Sunday, 1 is Monday, ... *) type url = Neturl.url type category = Rss_types.category = { cat_name : string ; cat_domain : url option ; } type image = Rss_types.image = { image_url : url ; image_title : string ; image_link : url ; image_height : int option ; image_width : int option ; image_desc : string option ; } type text_input = Rss_types.text_input = { ti_title : string ; (** The label of the Submit button in the text input area. *) ti_desc : string ; (** Explains the text input area. *) ti_name : string ; (** The name of the text object in the text input area. *) ti_link : url ; (** The URL of the CGI script that processes text input requests. *) } type enclosure = Rss_types.enclosure = { encl_url : url ; (** URL of the enclosure *) encl_length : int ; (** size in bytes *) encl_type : string ; (** MIME type *) } (** See {{:http://cyber.law.harvard.edu/rss/soapMeetsRss.html#rsscloudInterface} specification} *) type cloud = Rss_types.cloud = { cloud_domain : string ; cloud_port : int ; cloud_path : string ; cloud_register_procedure : string ; cloud_protocol : string ; } type guid = Rss_types.guid = Guid_permalink of url | Guid_name of string type source = Rss_types.source = { src_name : string ; src_url : url ; } type 'a item_t = 'a Rss_types.item_t = { item_title : string option ; item_link : url option ; item_desc : string option ; item_pubdate : Netdate.t option ; item_author : email option ; item_categories : category list ; item_comments : url option ; item_enclosure : enclosure option ; item_guid : guid option ; item_source : source option ; item_data : 'a option ; } type namespace = (string * string) type ('a, 'b) channel_t = ('a, 'b) Rss_types.channel_t = { ch_title : string ; ch_link : url ; ch_desc : string ; ch_language : string option ; ch_copyright : string option ; ch_managing_editor : email option ; ch_webmaster : email option ; ch_pubdate : Netdate.t option ; ch_last_build_date : Netdate.t option ; ch_categories : category list ; ch_generator : string option ; ch_cloud : cloud option ; ch_docs : url option ; ch_ttl : int option ; ch_image : image option ; ch_rating : pics_rating option ; ch_text_input : text_input option ; ch_skip_hours : skip_hours option ; ch_skip_days : skip_days option ; ch_items : 'b item_t list ; ch_data : 'a option ; ch_namespaces : namespace list ; } type item = unit item_t type channel = (unit, unit) channel_t let item ?title ?link ?desc ?pubdate ?author ?(cats=[]) ?comments ?encl ?guid ?source ?data () = { item_title = title ; item_link = link ; item_desc = desc; item_pubdate = pubdate ; item_author = author ; item_categories = cats ; item_comments = comments ; item_enclosure = encl ; item_guid = guid ; item_source = source ; item_data = data ; } let channel ~title ~link ~desc ?language ?copyright ?managing_editor ?webmaster ?pubdate ?last_build_date ?(cats=[]) ?generator ?cloud ?docs ?ttl ?image ?rating ?text_input ?skip_hours ?skip_days ?data ?(namespaces=[]) items = { ch_title = title ; ch_link = link ; ch_desc = desc ; ch_language = language ; ch_copyright = copyright ; ch_managing_editor = managing_editor ; ch_webmaster = webmaster ; ch_pubdate = pubdate ; ch_last_build_date = last_build_date ; ch_categories = cats ; ch_generator = generator ; ch_cloud = cloud ; ch_docs = docs ; ch_ttl = ttl ; ch_image = image ; ch_rating = rating ; ch_text_input = text_input ; ch_skip_hours = skip_hours ; ch_skip_days = skip_days ; ch_items = items ; ch_data = data ; ch_namespaces = namespaces ; } let compare_item = Rss_types.compare_item;; let copy_item i = { i with item_title = i.item_title };; let copy_channel c = { c with ch_items = List.map copy_item c.ch_items } ;; let sort_items_by_date l = List.sort (fun i1 i2 -> match i1.item_pubdate, i2.item_pubdate with None, None -> 0 | Some _, None -> -1 | None, Some _ -> 1 | Some d1, Some d2 -> compare (Netdate.since_epoch d2) (Netdate.since_epoch d1) ) l;; let merge_channels c1 c2 = let items = sort_items_by_date (c1.ch_items @ c2.ch_items) in let c = copy_channel c1 in { c with ch_items = items } ;; type xmltree = Rss_io.xmltree = E of Xmlm.tag * xmltree list | D of string let xml_of_source = Rss_io.xml_of_source exception Error = Rss_io.Error type ('a, 'b) opts = ('a, 'b) Rss_io.opts let make_opts = Rss_io.make_opts let default_opts = Rss_io.default_opts let channel_t_of_file = Rss_io.channel_of_file let channel_t_of_string = Rss_io.channel_of_string let channel_t_of_channel = Rss_io.channel_of_channel let channel_t_of_xmls = Rss_io.channel_of_xmls let channel_of_file = Rss_io.channel_of_file default_opts let channel_of_string = Rss_io.channel_of_string default_opts let channel_of_channel = Rss_io.channel_of_channel default_opts type 'a data_printer = 'a -> xmltree list let print_channel = Rss_io.print_channel let print_file ?channel_data_printer ?item_data_printer ?indent ?date_fmt ?encoding file ch = let oc = open_out file in let fmt = Format.formatter_of_out_channel oc in print_channel ?channel_data_printer ?item_data_printer ?indent ?date_fmt ?encoding fmt ch; Format.pp_print_flush fmt (); close_out oc let keep_n_items n channel = let rec iter acc m = function [] -> List.rev acc | i :: q when m > n -> List.rev acc | i :: q -> iter (i :: acc) (m+1) q in let c = copy_channel channel in { c with ch_items = iter [] 1 c.ch_items } ;; ocamlrss-2.2.2/rss.mli000066400000000000000000000305011220015140000146320ustar00rootroot00000000000000(******************************************************************************) (* OCamlrss *) (* *) (* Copyright (C) 2004-2013 Institut National de Recherche en Informatique *) (* et en Automatique. All rights reserved. *) (* *) (* This program is free software; you can redistribute it and/or modify *) (* it under the terms of the GNU Lesser General Public License version *) (* 3 as published by the Free Software Foundation. *) (* *) (* This program is distributed in the hope that it will be useful, *) (* but WITHOUT ANY WARRANTY; without even the implied warranty of *) (* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *) (* GNU Library General Public License for more details. *) (* *) (* You should have received a copy of the GNU Library General Public *) (* License along with this program; if not, write to the Free Software *) (* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA *) (* 02111-1307 USA *) (* *) (* Contact: Maxence.Guesdon@inria.fr *) (* *) (* *) (******************************************************************************) (** The RSS library to read and write RSS 2.0 files. Reference: {{:http://www.rssboard.org/rss-specification}RSS 2.0 specification}. *) (** {2 Types} *) type date = Netdate.t val string_of_date : ?fmt: string -> date -> string (** Format a date/time record as a string according to the format string [fmt]. @param fmt The format string. It consists of zero or more conversion specifications and ordinary characters. All ordinary characters are kept as such in the final string. A conversion specification consists of the '%' character and one other character. See [Netdate.format_to] for more details. Default: ["%d %b %Y"]. *) type email = string (** can be, for example: foo\@bar.com (Mr Foo Bar) *) type pics_rating = string type skip_hours = int list (** 0 .. 23 *) type skip_days = int list (** 0 is Sunday, 1 is Monday, ... *) type url = Neturl.url type category = { cat_name : string ; (** A forward-slash-separated string that identifies a hierarchic location in the indicated taxonomy. *) cat_domain : url option ; (** Identifies a categorization taxonomy. *) } type image = { image_url : url ; (** The URL of a GIF, JPEG or PNG image that represents the channel. *) image_title : string ; (** Description of the image, it's used in the ALT attribute of the HTML tag when the channel is rendered in HTML. *) image_link : url ; (** The URL of the site, when the channel is rendered, the image is a link to the site. (Note, in practice the [image_title] and [image_link] should have the same value as the {!channel}'s [ch_title] and [ch_link].) *) image_height : int option ; (** Height of the image, in pixels. *) image_width : int option ; (** Width of the image, in pixels. *) image_desc : string option ; (** Text to be included in the "title" attribute of the link formed around the image in the HTML rendering. *) } type text_input = { ti_title : string ; (** The label of the Submit button in the text input area. *) ti_desc : string ; (** Explains the text input area. *) ti_name : string ; (** The name of the text object in the text input area. *) ti_link : url ; (** The URL of the CGI script that processes text input requests. *) } type enclosure = { encl_url : url ; (** URL of the enclosure *) encl_length : int ; (** size in bytes *) encl_type : string ; (** MIME type *) } type guid = | Guid_permalink of url (** A permanent URL pointing to the story. *) | Guid_name of string (** A string that uniquely identifies the item. *) type source = { src_name : string ; src_url : url ; } (** See {{:http://cyber.law.harvard.edu/rss/soapMeetsRss.html#rsscloudInterface} specification} *) type cloud = { cloud_domain : string ; cloud_port : int ; cloud_path : string ; cloud_register_procedure : string ; cloud_protocol : string ; } (** An item may represent a "story". Its description is a synopsis of the story (or sometimes the full story), and the link points to the full story. *) type 'a item_t = { item_title : string option; (** Optional title *) item_link : url option; (** Optional link *) item_desc : string option; (** Optional description *) item_pubdate : date option ; (** Date of publication *) item_author : email option ; (** The email address of the author of the item. *) item_categories : category list ; (** Categories for the item. See the field {!category}. *) item_comments : url option ; (** Url of comments about this item *) item_enclosure : enclosure option ; item_guid : guid option ; (** A globally unique identifier for the item. *) item_source : source option ; item_data : 'a option ; (** Additional data, since RSS can be extended with namespace-prefixed nodes.*) } (** A namespace is a pair (name, url). *) type namespace = string * string type ('a, 'b) channel_t = { ch_title : string ; (** Mandatory. The name of the channel, for example the title of your web site. *) ch_link : url ; (** Mandatory. The URL to the HTML website corresponding to the channel. *) ch_desc : string ; (** Mandatory. A sentence describing the channel. *) ch_language : string option ; (** Language of the news, e.g. "en". See the W3C {{:http://www.w3.org/TR/REC-html40/struct/dirlang.html#langcodes} language codes}. *) ch_copyright : string option ; (** Copyright notice. *) ch_managing_editor : email option ; (** Managing editor of the news. *) ch_webmaster : email option ; (** The address of the webmasterof the site. *) ch_pubdate : date option ; (** Publication date of the channel. *) ch_last_build_date : date option ; (** When the channel content changed for the last time. *) ch_categories : category list ; (** Categories for the channel. See the field {!category}. *) ch_generator : string option ; (** The tool used to generate this channel. *) ch_cloud : cloud option ; (** Allows processes to register with a cloud to be notified of updates to the channel. *) ch_docs : url option ; (** An url to a RSS format reference. *) ch_ttl : int option ; (** Time to live, in minutes. It indicates how long a channel can be cached before refreshing from the source. *) ch_image : image option ; ch_rating : pics_rating option; (** The PICS rating for the channel. *) ch_text_input : text_input option ; ch_skip_hours : skip_hours option ; (** A hint for aggregators telling them which hours they can skip.*) ch_skip_days : skip_days option ; (** A hint for aggregators telling them which days they can skip. *) ch_items : 'b item_t list ; ch_data : 'a option ; (** Additional data, since RSS can be extended with namespace-prefixed nodes.*) ch_namespaces : namespace list ; } type item = unit item_t type channel = (unit, unit) channel_t (** {2 Building items and channels} *) val item : ?title: string -> ?link: url -> ?desc: string -> ?pubdate: date -> ?author: email -> ?cats: category list -> ?comments: url -> ?encl: enclosure -> ?guid: guid -> ?source: source -> ?data: 'a -> unit -> 'a item_t (** [item()] creates a new item with all fields set to [None]. Use the optional parameters to set fields. *) val channel : title: string -> link: url -> desc: string -> ?language: string -> ?copyright: string -> ?managing_editor: email -> ?webmaster: email -> ?pubdate: date -> ?last_build_date: date -> ?cats: category list -> ?generator: string -> ?cloud: cloud -> ?docs: url -> ?ttl: int -> ?image: image -> ?rating: pics_rating -> ?text_input: text_input -> ?skip_hours: skip_hours -> ?skip_days: skip_days -> ?data: 'a -> ?namespaces: namespace list -> 'b item_t list -> ('a, 'b) channel_t (** [channel items] creates a new channel containing [items]. Other fields are set to [None] unless the corresponding optional parameter is used. *) val compare_item : ?comp_data: ('a -> 'a -> int) -> 'a item_t -> 'a item_t -> int val copy_item : 'a item_t -> 'a item_t val copy_channel : ('a, 'b) channel_t -> ('a, 'b) channel_t (** {2 Manipulating channels} *) val keep_n_items : int -> ('a, 'b) channel_t -> ('a, 'b) channel_t (** [keep_n_items n ch] returns a copy of the channel, keeping only [n] items maximum. *) val sort_items_by_date : 'a item_t list -> 'a item_t list (** Sort items by date, older last. *) val merge_channels : ('a, 'b) channel_t -> ('a, 'b) channel_t -> ('a, 'b) channel_t (** [merge_channels c1 c2] merges the given channels in a new channel, sorting items using {!sort_items_by_date}. Channel information are copied from the first channel [c1]. *) (** {2 Reading channels} *) (** This represents XML trees. Such XML trees are given to functions provided to read additional data from RSS channels and items. *) type xmltree = E of Xmlm.tag * xmltree list | D of string (** Read an XML tree from a source. @raise Failure in case of error.*) val xml_of_source : Xmlm.source -> xmltree (** Use this exception to indicate an error is functions given to [make_opts] used to read additional data from prefixed XML nodes. *) exception Error of string (** Options used when reading source. *) type ('a, 'b) opts (** See Neturl documentation for [schemes] and [base_syntax] options. They are used to parse URLs. @param read_channel_data provides a way to read additional information from the subnodes of the channels. All these subnodes are prefixed by an expanded namespace. @param read_item_data is the equivalent of [read_channel_data] parameter but is called of each item with its prefixed subnodes. *) val make_opts : ?schemes: (string, Neturl.url_syntax) Hashtbl.t -> ?base_syntax: Neturl.url_syntax -> ?read_channel_data: (xmltree list -> 'a option) -> ?read_item_data: (xmltree list -> 'b option) -> unit -> ('a, 'b) opts val default_opts : (unit, unit) opts (** [channel_[t_]of_X] returns the parsed channel and a list of encountered errors. Note that only namespaces declared in the root not of the XML tree are added to [ch_namespaces] field. @raise Failure if the channel could not be parsed. *) val channel_t_of_file : ('a, 'b) opts -> string -> (('a, 'b) channel_t * string list) val channel_t_of_string : ('a, 'b) opts -> string -> (('a, 'b) channel_t * string list) val channel_t_of_channel : ('a, 'b) opts -> in_channel -> (('a, 'b) channel_t * string list) (** Read a channel from XML trees. These trees correspond to nodes under the ["channel"] XML node of a reguler RSS document. *) val channel_t_of_xmls : ('a, 'b) opts -> xmltree list -> (('a, 'b) channel_t * string list) val channel_of_file : string -> (channel * string list) val channel_of_string : string -> (channel * string list) val channel_of_channel : in_channel -> (channel * string list) (** {2 Writing channels} *) type 'a data_printer = 'a -> xmltree list val print_channel : ?channel_data_printer: 'a data_printer -> ?item_data_printer: 'b data_printer -> ?indent: int -> ?date_fmt: string -> ?encoding: string -> Format.formatter -> ('a, 'b) channel_t -> unit val print_file : ?channel_data_printer: 'a data_printer -> ?item_data_printer: 'b data_printer -> ?indent: int -> ?date_fmt: string -> ?encoding: string -> string -> ('a, 'b) channel_t -> unit ocamlrss-2.2.2/rss_io.ml000066400000000000000000000564701220015140000151650ustar00rootroot00000000000000(******************************************************************************) (* OCamlrss *) (* *) (* Copyright (C) 2004-2013 Institut National de Recherche en Informatique *) (* et en Automatique. All rights reserved. *) (* *) (* This program is free software; you can redistribute it and/or modify *) (* it under the terms of the GNU Lesser General Public License version *) (* 3 as published by the Free Software Foundation. *) (* *) (* This program is distributed in the hope that it will be useful, *) (* but WITHOUT ANY WARRANTY; without even the implied warranty of *) (* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *) (* GNU Library General Public License for more details. *) (* *) (* You should have received a copy of the GNU Library General Public *) (* License along with this program; if not, write to the Free Software *) (* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA *) (* 02111-1307 USA *) (* *) (* Contact: Maxence.Guesdon@inria.fr *) (* *) (* *) (******************************************************************************) open Xmlm open Rss_types (** Parsing/Printing RSS documents. *) type xmltree = E of Xmlm.tag * xmltree list | D of string let string_of_xml ?ns_prefix ?indent tree = try let b = Buffer.create 256 in let output =Xmlm.make_output ?ns_prefix ~indent ~decl: false (`Buffer b) in let frag = function | E (tag, childs) -> `El (tag, childs) | D d -> `Data d in Xmlm.output_doc_tree frag output (None, tree); Buffer.contents b with Xmlm.Error ((line, col), error) -> let msg = Printf.sprintf "Line %d, column %d: %s" line col (Xmlm.error_message error) in failwith msg ;; let source_string = function `String (n, s) -> String.sub s n (String.length s - n) | `Channel _ | `Fun _ -> "" ;; (* [trim] borrowed from [String] to ensure that this code also compiles with OCaml 3.12. *) let is_space = function | ' ' | '\012' | '\n' | '\r' | '\t' -> true | _ -> false let trim s = let len = String.length s in let i = ref 0 in while !i < len && is_space (String.unsafe_get s !i) do incr i done; let j = ref (len - 1) in while !j >= !i && is_space (String.unsafe_get s !j) do decr j done; if !i = 0 && !j = len - 1 then s else if !j >= !i then String.sub s !i (!j - !i + 1) else "" let xml_of_source source = try let input = Xmlm.make_input ~strip:false ~enc: (Some `UTF_8) (*~entity: (fun s -> Some s)*) source in let el tag childs = E (tag, childs) in let data d = D (trim d) in let (_, tree) = Xmlm.input_doc_tree ~el ~data input in tree with Xmlm.Error ((line, col), error) -> let msg = Printf.sprintf "Line %d, column %d: %s\n%s" line col (Xmlm.error_message error) (source_string source) in failwith msg | Invalid_argument e -> let msg = Printf.sprintf "%s:\n%s" e (source_string source) in failwith msg ;; (** {2 Parsing} *) type ('a, 'b) opts = { schemes : (string, Neturl.url_syntax) Hashtbl.t ; base_syntax : Neturl.url_syntax ; mutable errors : string list ; read_channel_data : (xmltree list -> 'a option) option ; read_item_data : (xmltree list -> 'b option) option ; } let add_error opts msg = opts.errors <- msg :: opts.errors;; exception Error of string;; let error msg = raise (Error msg);; let find_ele name e = match e with E ((("",e),_),_) when name = String.lowercase e -> true | E ((("http://purl.org/rss/1.0/",e),_),_) when name = String.lowercase e -> true | _ -> false let apply_opt f = function None -> None | Some v -> Some (f v) let url_of_string opts s = try Neturl.parse_url ~schemes: opts.schemes ~base_syntax: opts.base_syntax ~accept_8bits: true ~enable_fragment: true s with Neturl.Malformed_URL -> error (Printf.sprintf "Malformed url %S" s) let get_att ?ctx ?(required=true) atts name = let name = String.lowercase name in try snd (List.find (function (("",s),_) -> String.lowercase s = name | _ -> false) atts) with Not_found -> if required then match ctx with None -> raise Not_found | Some (opts, tag) -> let msg = Printf.sprintf "Attribute %S not found in tag %S" name tag in add_error opts msg; raise Not_found else "" let get_opt_att atts name = let name = String.lowercase name in try Some (snd (List.find (function (("",s), _) -> String.lowercase s = name | _ -> false) atts) ) with Not_found -> None let get_source opts xmls = try match List.find (find_ele "source") xmls with E ((_,atts),[D s]) -> Some { src_name = s ; src_url = url_of_string opts (get_att atts "url") ; } | _ -> None with | Error msg -> add_error opts msg; None | _ -> None let get_enclosure opts xmls = try match List.find (find_ele "enclosure") xmls with E ((_,atts),_) -> let ctx = (opts, "enclosure") in Some { encl_url = url_of_string opts (get_att ~ctx atts "url") ; encl_length = int_of_string (get_att ~ctx atts "length") ; encl_type = get_att ~ctx atts "type" ; } | D _ -> assert false with | Error msg -> add_error opts msg; None | _ -> None let get_categories opts xmls = let f acc = function E ((("",tag),atts),[D s]) when String.lowercase tag = "category"-> begin try { cat_name = s ; cat_domain = apply_opt (url_of_string opts) (get_opt_att atts "domain") ; } :: acc with Error msg -> add_error opts msg; acc end | _ -> acc in List.rev (List.fold_left f [] xmls) let get_guid opts xmls = try match List.find (find_ele "guid") xmls with E ((_,atts), [D s]) -> let x = match get_att ~required: false atts "ispermalink" with "true" -> Guid_permalink (url_of_string opts s) | _ -> Guid_name s in Some x | _ -> None with | Error msg -> add_error opts msg; None | _ -> None let get_cloud opts xmls = try match List.find (find_ele "cloud") xmls with E ((_,atts), _) -> let get = get_att ~ctx: (opts, "cloud") atts in let port = let port = get "port" in try int_of_string port with _ -> error (Printf.sprintf "Invalid cloud port %S" port) in Some { cloud_domain = get "domain" ; cloud_port = port ; cloud_path = get "path" ; cloud_register_procedure = get "registerprocedure" ; cloud_protocol = get "protocol" ; } | D _ -> assert false with | Error msg -> add_error opts msg; None | _ -> None let find_sub_cdata = fun tag xmls name -> try match List.find (find_ele name) xmls with E ((_,_),[D s]) -> s | E ((_,_),[]) -> "" | _ -> let msg = Printf.sprintf "Invalid contents for node %S under %S" name tag in error msg with Not_found -> let msg = Printf.sprintf "No node %S in node %S" name tag in error msg let get_image opts xmls = try match List.find (find_ele "image") xmls with E ((_,atts),subs) -> let f = find_sub_cdata "image" xmls in let f_opt s = try match List.find (find_ele s) subs with E ((_,_),[D s]) -> Some (f s) | _ -> None with _ -> None in Some { image_url = url_of_string opts (f "url") ; image_title = f "title" ; image_link = url_of_string opts (f "link") ; image_width = apply_opt int_of_string (f_opt "width") ; image_height = apply_opt int_of_string (f_opt "height") ; image_desc = f_opt "description" ; } | D _ -> assert false with | Error msg -> add_error opts msg; None | _ -> None let get_text_input opts xmls = try match List.find (find_ele "textinput") xmls with E ((_,atts),subs) -> let f = find_sub_cdata "textInput" xmls in Some { ti_title = f "title" ; ti_desc = f "description" ; ti_name = f "name" ; ti_link = url_of_string opts (f "link") ; } | D _ -> assert false with | Error msg -> add_error opts msg; None | _ -> None let filter_prefixed_nodes = List.filter (function | D _ | E ((("",_),_), _) -> false | _ -> true ) ;; let read_ns_data f xmls = match f with None -> None | Some f -> let xmls = filter_prefixed_nodes xmls in f xmls let item_of_xmls opts xmls = let f_opt s = try Some (find_sub_cdata "item" xmls s) with _ -> None in let date = match f_opt "pubdate" with None -> None | Some s -> try Some (Netdate.parse s) with _ -> add_error opts (Printf.sprintf "Invalid date %S" s); None in let data = try read_ns_data opts.read_item_data xmls with Error msg -> add_error opts msg; None in try let item = { item_title = f_opt "title" ; item_link = apply_opt (url_of_string opts) (f_opt "link") ; item_desc = f_opt "description" ; item_pubdate = date ; item_author = f_opt "author" ; item_categories = get_categories opts xmls ; item_comments = apply_opt (url_of_string opts) (f_opt "comments") ; item_enclosure = get_enclosure opts xmls ; item_guid = get_guid opts xmls ; item_source = get_source opts xmls ; item_data = data ; } in Some item with Error msg -> add_error opts msg ; None let items_of_xmls opts xmls = List.rev (List.fold_left (fun acc e -> match e with D _ -> acc | E ((("",s),_),subs) when String.lowercase s = "item" -> begin match item_of_xmls opts subs with None -> acc | Some item -> item :: acc end | E _ -> acc ) [] xmls ) let get_skip_hours opts xmls = try let f_hour acc = function E ((("","hour"),_),[D s]) -> begin match try let h = int_of_string s in if h < 0 || h > 23 then failwith "" ; Some h with _ -> add_error opts (Printf.sprintf "Invalid hour %S" s); None with None -> acc | Some h -> h :: acc end | _ -> acc in match List.find (find_ele "skiphours") xmls with E ((_,_),subs) -> Some (List.sort Pervasives.compare (List.fold_left f_hour [] subs)) | D _ -> assert false with | Error msg -> add_error opts msg; None | _ -> None ;; let int_of_day = function "sunday" -> 0 | "monday" -> 1 | "tuesday" -> 2 | "wednesday" -> 3 | "thursday" -> 4 | "friday" -> 5 | "saturday" -> 6 | s -> failwith (Printf.sprintf "Invalid day %S" s) let day_of_int = function 0 -> "Sunday" | 1 -> "Monday" | 2 -> "Tuesday" | 3 -> "Wednesday" | 4 -> "Thursday" | 5 -> "Friday" | 6 -> "Saturday" | n -> failwith ("Invalid day "^(string_of_int n)) ;; let get_skip_days opts xmls = let f_day acc = function E ((("", "day"), _), [D day]) -> begin try (int_of_day day) :: acc with Failure msg -> add_error opts msg; acc end | _ -> acc in try match List.find (find_ele "skipdays") xmls with E ((_,_),subs) -> Some (List.sort Pervasives.compare (List.fold_left f_day [] subs)) | D _ -> assert false with | Error msg -> add_error opts msg; None | _ -> None let channel_of_xmls opts xmls = let f s = try find_sub_cdata "channel" xmls s with Error msg -> failwith msg in let f_opt s = try Some (f s) with _ -> None in let pubdate = match f_opt "pubdate" with None -> None | Some s -> try Some (Netdate.parse s) with _ -> add_error opts (Printf.sprintf "Invalid date %S" s); None in let builddate = match f_opt "lastbuilddate" with None -> None | Some s -> try Some (Netdate.parse s) with _ -> add_error opts (Printf.sprintf "Invalid date %S" s); None in let ttl = match f_opt "ttl" with None -> None | Some s -> try Some (int_of_string s) with _ -> add_error opts (Printf.sprintf "Invalid ttl %S" s); None in let link = try url_of_string opts (f "link") with Error msg -> failwith msg in let docs = try apply_opt (url_of_string opts) (f_opt "docs") with Error msg -> add_error opts msg; None in let data = try read_ns_data opts.read_channel_data xmls with Error msg -> add_error opts msg; None in { ch_title = f "title" ; ch_link = link ; ch_desc = f "description" ; ch_language = f_opt "language" ; ch_copyright = f_opt "copyright" ; ch_managing_editor = f_opt "managingeditor" ; ch_webmaster = f_opt "webmaster" ; ch_pubdate = pubdate ; ch_last_build_date = builddate ; ch_categories = get_categories opts xmls ; ch_generator = f_opt "generator" ; ch_cloud = get_cloud opts xmls ; ch_docs = docs ; ch_ttl = ttl ; ch_image = get_image opts xmls ; ch_rating = f_opt "rating" ; ch_text_input = get_text_input opts xmls ; ch_skip_hours = get_skip_hours opts xmls ; ch_skip_days = get_skip_days opts xmls ; ch_items = items_of_xmls opts xmls ; ch_data = data ; ch_namespaces = [] ; } let channel_of_source opts source = let xml = xml_of_source source in let opts = { opts with errors = [] } in match xml with | D _ -> failwith "Parse error: not an element" | E (((_,e), atts), subs) -> let (channel, errors) = match String.lowercase e with "rss" -> ( try let elt = List.find (find_ele "channel") subs in match elt with E ((("",_), atts), subs) -> (channel_of_xmls opts subs, opts.errors) | _ -> assert false with Not_found -> failwith "Parse error: no channel" ) | "rdf" -> ( try let elt = List.find (find_ele "channel") subs in match elt with | E ((_, atts), subs_ch)-> (channel_of_xmls opts (subs_ch @ subs), opts.errors) | _ -> assert false with Not_found -> failwith "Parse error: not channel" ) | _ -> failwith "Parse error: not rss" in let namespaces = let f ((prefix, name), value) acc = if prefix = Xmlm.ns_xmlns && name <> "xmlns" then (name, value) :: acc else acc in List.fold_right f atts [] in ({ channel with ch_namespaces = namespaces }, errors) let make_opts ?(schemes=Neturl.common_url_syntax) ?(base_syntax=Hashtbl.find Neturl.common_url_syntax "http") ?read_channel_data ?read_item_data () = { schemes ; base_syntax ; errors = [] ; read_item_data ; read_channel_data ; } let default_opts = make_opts ();; let channel_of_string opts s = channel_of_source opts (`String (0, s)) let channel_of_file opts file = let ic = open_in file in try channel_of_source opts (`Channel ic) with e -> close_in ic; raise e ;; let channel_of_channel opts ch = channel_of_source opts (`Channel ch);; let channel_of_xmls opts xmls = (channel_of_xmls opts xmls, opts.errors) (** {2 Printing} *) let opt_element opt s = match opt with None -> [] | Some v -> [E ((("",s), []), [D v])] let default_date_format = "%d %b %Y %T %z" (* ex: 19 May 2002 15:21:36 *) let err_date d = () (* let module D = Netdate in let p = Printf.eprintf in prerr_endline "{"; p "year = %d\n" d.D.year ; p "month = %d\n" d.D.month ; p "day = %d\n" d.D.day ; p "hour = %d\n" d.D.hour ; p "minute = %d\n" d.D.minute ; p "second = %d\n" d.D.second ; p "zone = %d\n" d.D.zone ; p "week_day = %d\n" d.D.week_day ; prerr_endline "}" *) let xml_of_category c = let atts = match c.cat_domain with None -> [] | Some d -> [("","domain"), Neturl.string_of_url d] in E ((("","category"), atts), [D c.cat_name]) let xmls_of_categories l = List.map xml_of_category l let xmls_of_opt_f f v_opt = match v_opt with None -> [] | Some v -> [f v] let xml_of_enclosure e = E ((("","enclosure"), [ ("","url"), Neturl.string_of_url e.encl_url ; ("","length"), string_of_int e.encl_length ; ("","type"), e.encl_type ; ]), [] ) let xmls_of_enclosure_opt = xmls_of_opt_f xml_of_enclosure let xml_of_guid = function Guid_permalink url -> E ((("","guid"), [("","isPermaLink"), "true"]), [D (Neturl.string_of_url url)] ) | Guid_name name -> E ((("","guid"), []), [D name]) let xmls_of_guid_opt = xmls_of_opt_f xml_of_guid let xml_of_source_field s = E ((("", "source"), [("","url"), (Neturl.string_of_url s.src_url)]), [D s.src_name] ) let xmls_of_source_opt = xmls_of_opt_f xml_of_source_field let xml_of_image i = E ((("", "image"), []), [ E((("","url"),[]), [D (Neturl.string_of_url i.image_url)]) ; E((("","title"),[]), [D i.image_title]) ; E((("","link"),[]), [D (Neturl.string_of_url i.image_link)]) ] @ (List.flatten [ opt_element (apply_opt string_of_int i.image_width) "width"; opt_element (apply_opt string_of_int i.image_height) "height"; opt_element i.image_desc "description" ; ] ) ) let xmls_of_image_opt = xmls_of_opt_f xml_of_image let xml_of_cloud c = let atts = [ ("","domain"), c.cloud_domain ; ("","port"), string_of_int c.cloud_port ; ("","path"), c.cloud_path ; ("","registerProcedure"), c.cloud_register_procedure ; ("","protocol"), c.cloud_protocol ; ] in E ((("", "cloud"), atts), []) let xmls_of_cloud_opt = xmls_of_opt_f xml_of_cloud let xml_of_skip_hours = let f h = E ((("","hour"), []), [D (string_of_int h)]) in fun hours -> E ((("","hours"), []), List.map f hours) ;; let xmls_of_skip_hours_opt = xmls_of_opt_f xml_of_skip_hours let xml_of_skip_days = let f day = let s = day_of_int day in E ((("","day"), []), [D s]) in fun days -> E ((("","days"), []), List.map f days) ;; let xmls_of_skip_days_opt = xmls_of_opt_f xml_of_skip_days let xml_of_text_input t = E ((("","textInput"), []), [ E((("","title"), []), [D t.ti_title]) ; E((("","description"),[]), [D t.ti_desc]) ; E((("","name"), []), [D t.ti_name]) ; E((("","link"), []), [D (Neturl.string_of_url t.ti_link)]) ; ] ) let xmls_of_text_input_opt = xmls_of_opt_f xml_of_text_input let xml_of_item ?item_data_printer ~date_fmt i = let data_xml = match i.item_data, item_data_printer with None, _ | _, None -> [] | Some data, Some p -> p data in E ((("","item"), []), (List.flatten [ opt_element i.item_title "title" ; opt_element (apply_opt Neturl.string_of_url i.item_link) "link" ; opt_element i.item_desc "description" ; opt_element (match i.item_pubdate with None -> None | Some d -> err_date d; Some (Netdate.format ~fmt: date_fmt d)) "pubDate" ; opt_element i.item_author "author" ; xmls_of_categories i.item_categories ; opt_element (apply_opt Neturl.string_of_url i.item_comments) "comments" ; xmls_of_enclosure_opt i.item_enclosure ; xmls_of_guid_opt i.item_guid ; xmls_of_source_opt i.item_source ; data_xml ; ] ) ) let xml_of_channel ?channel_data_printer ?item_data_printer ~date_fmt ch = let f v s = E ((("",s), []), [D v]) in let data_xml = match ch.ch_data, channel_data_printer with None, _ | _, None -> [] | Some data, Some p -> p data in let xml_ch = E ((("","channel"), []), ( [ f ch.ch_title "title" ; f (Neturl.string_of_url ch.ch_link) "link" ; f ch.ch_desc "description" ; ] @ (List.flatten [ opt_element ch.ch_language "language" ; opt_element ch.ch_copyright "copyright" ; opt_element ch.ch_managing_editor "managingEditor" ; opt_element ch.ch_webmaster "webMaster" ; opt_element (match ch.ch_pubdate with None -> None | Some d -> err_date d ; Some (Netdate.format ~fmt: date_fmt d) ) "pubDate" ; opt_element (match ch.ch_last_build_date with None -> None | Some d -> err_date d ; Some (Netdate.format ~fmt: date_fmt d)) "lastBuildDate" ; xmls_of_categories ch.ch_categories ; opt_element ch.ch_generator "generator" ; xmls_of_cloud_opt ch.ch_cloud ; opt_element (apply_opt Neturl.string_of_url ch.ch_docs) "docs" ; opt_element (apply_opt string_of_int ch.ch_ttl) "ttl"; xmls_of_image_opt ch.ch_image ; opt_element ch.ch_rating "rating" ; xmls_of_text_input_opt ch.ch_text_input ; xmls_of_skip_hours_opt ch.ch_skip_hours ; xmls_of_skip_days_opt ch.ch_skip_days ; data_xml ; List.map (xml_of_item ?item_data_printer ~date_fmt) ch.ch_items ; ] ) ) ) in E ((("","rss"), [("","version"), "2.0"]), [xml_ch]) module SMap = Map.Make (struct type t = string let compare = Pervasives.compare end);; let print_channel ?channel_data_printer ?item_data_printer ?indent ?(date_fmt=default_date_format) ?(encoding="UTF-8")fmt ch = let xml = xml_of_channel ?channel_data_printer ?item_data_printer ~date_fmt ch in let known_ns = List.fold_left (fun map (name, url) -> SMap.add url name map) SMap.empty ch.ch_namespaces in let ns_prefix url = try Some (SMap.find url known_ns) with Not_found -> None in let xml = match xml with D _ -> assert false | E ((tag, atts), subs) -> let f acc (name, url) = ((Xmlm.ns_xmlns, name), url) :: acc in let atts = List.rev (List.fold_left f atts ch.ch_namespaces) in E ((tag, atts), subs) in Format.fprintf fmt "\n" encoding; Format.fprintf fmt "%s" (string_of_xml ~ns_prefix ?indent xml ) ocamlrss-2.2.2/rss_types.ml000066400000000000000000000157631220015140000157220ustar00rootroot00000000000000(******************************************************************************) (* OCamlrss *) (* *) (* Copyright (C) 2004-2013 Institut National de Recherche en Informatique *) (* et en Automatique. All rights reserved. *) (* *) (* This program is free software; you can redistribute it and/or modify *) (* it under the terms of the GNU Lesser General Public License version *) (* 3 as published by the Free Software Foundation. *) (* *) (* This program is distributed in the hope that it will be useful, *) (* but WITHOUT ANY WARRANTY; without even the implied warranty of *) (* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *) (* GNU Library General Public License for more details. *) (* *) (* You should have received a copy of the GNU Library General Public *) (* License along with this program; if not, write to the Free Software *) (* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA *) (* 02111-1307 USA *) (* *) (* Contact: Maxence.Guesdon@inria.fr *) (* *) (* *) (******************************************************************************) (** *) type email = string (** can be, for example: foo\@bar.com (Mr Foo Bar) *) type pics_rating = string type skip_hours = int list (** 0 .. 23 *) type skip_days = int list (** 0 is Sunday, 1 is Monday, ... *) type url = Neturl.url type category = { cat_name : string ; cat_domain : url option ; } type image = { image_url : url ; image_title : string ; image_link : url ; image_height : int option ; image_width : int option ; image_desc : string option ; } type text_input = { ti_title : string ; (** The label of the Submit button in the text input area. *) ti_desc : string ; (** Explains the text input area. *) ti_name : string ; (** The name of the text object in the text input area. *) ti_link : url ; (** The URL of the CGI script that processes text input requests. *) } type enclosure = { encl_url : url ; (** URL of the enclosure *) encl_length : int ; (** size in bytes *) encl_type : string ; (** MIME type *) } (** See {{:http://cyber.law.harvard.edu/rss/soapMeetsRss.html#rsscloudInterface} specification} *) type cloud = { cloud_domain : string ; cloud_port : int ; cloud_path : string ; cloud_register_procedure : string ; cloud_protocol : string ; } type guid = Guid_permalink of url | Guid_name of string type source = { src_name : string ; src_url : url ; } type 'a item_t = { item_title : string option ; item_link : url option ; item_desc : string option ; item_pubdate : Netdate.t option ; item_author : email option ; item_categories : category list ; item_comments : url option ; item_enclosure : enclosure option ; item_guid : guid option ; item_source : source option ; item_data : 'a option ; } type ('a, 'b) channel_t = { ch_title : string ; ch_link : url ; ch_desc : string ; ch_language : string option ; ch_copyright : string option ; ch_managing_editor : email option ; ch_webmaster : email option ; ch_pubdate : Netdate.t option ; ch_last_build_date : Netdate.t option ; ch_categories : category list ; ch_generator : string option ; ch_cloud : cloud option ; ch_docs : url option ; ch_ttl : int option ; ch_image : image option ; ch_rating : pics_rating option ; ch_text_input : text_input option ; ch_skip_hours : skip_hours option ; ch_skip_days : skip_days option ; ch_items : 'b item_t list ; ch_data : 'a option ; ch_namespaces : (string * string) list ; } type item = unit item_t type channel = (unit, unit) channel_t let rec apply_comp item1 item2 = function [] -> 0 | f :: q -> match f item1 item2 with 0 -> apply_comp item1 item2 q | n -> n ;; let compare_opt f x y = match x, y with | Some _, None -> 1 | None, Some _ -> -1 | None, None -> 0 | Some x, Some y -> f x y ;; let compare_list f = let rec iter = function | [], [] -> 0 | [], _ -> -1 | _, [] -> 1 | (h1 :: q1), (h2 :: q2) -> match f h1 h2 with 0 -> iter (q1, q2) | n -> n in fun l1 l2 -> iter (l1, l2) let compare_url url1 url2 = Pervasives.compare (Neturl.string_of_url url1) (Neturl.string_of_url url2) let compare_url_opt = compare_opt compare_url;; let compare_enclosure e1 e2 = Pervasives.compare (Neturl.string_of_url e1.encl_url) (Neturl.string_of_url e2.encl_url) ;; let compare_guid g1 g2 = match g1, g2 with | Guid_permalink url1, Guid_permalink url2 -> Pervasives.compare (Neturl.string_of_url url1) (Neturl.string_of_url url2) | Guid_permalink _, Guid_name _ -> 1 | Guid_name _, Guid_permalink _ -> -1 | Guid_name s1, Guid_name s2 -> Pervasives.compare s1 s2 ;; let compare_source s1 s2 = match compare_url s1.src_url s2.src_url with 0 -> Pervasives.compare s1.src_name s2.src_name | n -> n ;; let compare_category c1 c2 = match compare_url_opt c1.cat_domain c2.cat_domain with 0 -> Pervasives.compare c1.cat_name c2.cat_name | n -> n ;; let item_comp_funs = [ (fun i1 i2 -> compare_url_opt i1.item_link i2.item_link) ; (fun i1 i2 -> Pervasives.compare i1.item_title i2.item_title) ; (fun i1 i2 -> Pervasives.compare i1.item_desc i2.item_desc) ; (fun i1 i2 -> Pervasives.compare i1.item_pubdate i2.item_pubdate) ; (fun i1 i2 -> Pervasives.compare i1.item_author i2.item_author) ; (fun i1 i2 -> compare_list compare_category i1.item_categories i2.item_categories) ; (fun i1 i2 -> compare_url_opt i1.item_comments i2.item_comments) ; (fun i1 i2 -> compare_opt compare_enclosure i1.item_enclosure i2.item_enclosure) ; (fun i1 i2 -> compare_opt compare_guid i1.item_guid i2.item_guid) ; (fun i1 i2 -> compare_opt compare_source i1.item_source i2.item_source) ; ] ;; let compare_item ?comp_data = let comp_funs = match comp_data with None -> item_comp_funs | Some f -> (fun i1 i2 -> compare_opt f i1.item_data i2.item_data) :: item_comp_funs in fun item1 item2 -> apply_comp item1 item2 comp_funs ;; ocamlrss-2.2.2/rsstest.ml000066400000000000000000000050521220015140000153640ustar00rootroot00000000000000(******************************************************************************) (* OCamlrss *) (* *) (* Copyright (C) 2004-2013 Institut National de Recherche en Informatique *) (* et en Automatique. All rights reserved. *) (* *) (* This program is free software; you can redistribute it and/or modify *) (* it under the terms of the GNU Lesser General Public License version *) (* 3 as published by the Free Software Foundation. *) (* *) (* This program is distributed in the hope that it will be useful, *) (* but WITHOUT ANY WARRANTY; without even the implied warranty of *) (* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *) (* GNU Library General Public License for more details. *) (* *) (* You should have received a copy of the GNU Library General Public *) (* License along with this program; if not, write to the Free Software *) (* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA *) (* 02111-1307 USA *) (* *) (* Contact: Maxence.Guesdon@inria.fr *) (* *) (* *) (******************************************************************************) (* Testing *) let fatal msg = prerr_endline msg; exit 1;; let main () = if Array.length Sys.argv < 2 then fatal (Printf.sprintf "Usage: %s " Sys.argv.(0)); try let opts = Rss.make_opts ~read_channel_data: (fun x -> Some x) ~read_item_data: (fun x -> Some x) () in let (channel, errors) = Rss.channel_t_of_file opts Sys.argv.(1) in let printer xmls = xmls in Rss.print_channel ~channel_data_printer: printer ~item_data_printer: printer ~indent: 2 Format.std_formatter channel; List.iter prerr_endline errors with | Sys_error s | Failure s -> fatal s ;; let () = main ();;ocamlrss-2.2.2/test.rss000066400000000000000000000105631220015140000150360ustar00rootroot00000000000000 Cameleon http://home.gna.org/cameleon/cameleon_news.rss News about Cameleon. en zoggy@bat8.org zoggy@bat8.org 02 Feb 2007 15:00:00 +0000 13 Jan 2010 11:30:00 +0000 1.9.21 is available http://download.gna.org/cameleon/cameleon_1.9.21.tar.gz Libraries Okey, Configwin, Gtksv_utils and gtk utilities (Gdir, Gmylist, Gmytree, Gstuff) are now in a different library out of Cameleon2: Lablgtk-extras (hosted on ocaml forge). Library Config_file is now out of Cameleon2 and hosted on ocaml forge. <pre><code>| p1 | p2 | |---------+---------| | x &lt;- 1 | y &lt;- 1 | | r1 &lt;- y | r2 &lt;- x |</code></pre> 13 Jan 2011 11:30:00 +0000 1.9.20 is available http://download.gna.org/cameleon/cameleon_1.9.20.tar.gz Changes: new commands in ocaml-mode, compatibility with ocaml 3.12.0, various fixes and improvments. 17 Jun 2010 09:00:00 +0000 1.9.19 is available http://download.gna.org/cameleon/cameleon_1.9.19.tar.gz Changes: now use GtkSourceView 2.X, new multi-clipboard view in Chamo (the view is called "multiclip"), new Custop module (to create Gtk interface to toplevel programs), new Ed_outputs module (to display various command outputs), ocamlbuild support in ocaml-mode, various things I don't remember of, lots of minor fixes. 24 Sep 2009 07:30:00 +0000 Cameleon2/Chamo at the 2nd OCaml Meeting http://home.gna.org/cameleon/cameleon_ocaml_meeting_2009.pdf A talk was given about Cameleon2 and Chamo at the second OCaml Meeting, in Grenoble on 4th february 2009. The slides are available. 05 Feb 2009 16:00:00 +0000 1.9.18 is available http://download.gna.org/cameleon/cameleon_1.9.18.tar.gz Bug fixes and enhancements in Chamo: Handle the encoding of files (ability to specify an encoding when opening a file, ability to change the encoding of a file), keep the date when saving a file to be able to detect modifications of a file while it is edited in chamo, keep some buffer attributes in the chamo files, like the encoding, the mode, ... Cameleon does not act as server by default (new command cam_start_server). OCaml-RSS is now included in Cameleon. 02 Feb 2007 16:00:00 +0000 1.9.17 is available http://download.gna.org/cameleon/cameleon_1.9.17.tar.gz Bug fixes and enhancements in Chamo. Search, query-replace and completion are available in Chamo, with regular expressions based on PCRE. 27 Nov 2006 16:00:00 +0000 1.9.16 is available http://download.gna.org/cameleon/cameleon_1.9.16.tar.gz The chamo editor is included and usable in cameleon. Many fixes and enhancements. 10 Oct 2006 12:00:00 +0000 1.9.15 is available http://download.gna.org/cameleon/cameleon_1.9.15.tar.gz Cameleon now includes chamo, a powerful text editor with OCaml as scripting language. 06 Oct 2006 09:44:00 +0000 1.9.13 is available http://download.gna.org/cameleon/cameleon_1.9.13.tar.gz A new configuration process which uses an OCaml script to detect required tools and libraries. Fixes and enhancements. 10 May 2006 11:30:07 +0000