Another Freenet Index Format


The idea behind this format is to better suit binary files (video, audio, etc.) while still being able to index text files by word.
All suggestions are, of course, welcome :)

IMPORTANT NOTICE


This format is not final! If you want to use it, you are strongly encouraged to subscribe to the RSS feed attached to this page.

Why XML?



(comment: Some people, when confronted with a problem, think "I know, I'll use XML." Now they have two problems. )

Sample


Index format

<?xml version='1.0' encoding='utf-8'?>

<index>

<header>
  <title>Index title</title>
  <owner>Index owner nickname</owner><!-- optional -->
  <date>YYYYMMDD</date><!-- insertion date --><!-- optional -->
  <email>email</email><!-- optional -->
  <client>software (for example Thaw 0.7 rxxx)</client><!-- optional -->

  <!-- Security note: if you already know a privateKey for this index, do not overwrite it with the one found in the index;
	     otherwise an attacker could easily do nasty things, such as blocking a public index. -->
  <!-- Second security note: if you already have a privateKey, check that the two match. If they do not, do not use it and do not republish the privateKey! -->
  <!-- Third note: if you are using FCP, beware of possible FCP injection. -->
  <privateKey>SSK@[...]</privateKey><!-- Optional, of course :) -->

  <!-- Optional -->
  <!-- category[/subCategory[/subsubCategory[...]]] -->
  <!-- Used for auto-sorting in Thaw -->
  <!-- Case-insensitive -->
  <category>freenet/thaw</category>
</header>

<indexes>
	<!-- This can be used to link to other indexes -->
	<!-- Category is optional -->

	<link key="USK@[...]" category="freenet/thaw" /> 
	<link key="USK@[...]" />
	[...]
</indexes>

 <files>

	  <!-- A size of 0 means that the file size is unknown -->
	  <file id="0"
	      key="CHK@[...]/thisIsAFile.avi"
	      size="5242880"
	      mime="video/x-msvideo">

	      <!-- Options are defined by file format filters or by the user -->
	      <option name="length" value="300" /><!-- In seconds -->
	      <option name="category" value="reportage" />
	      <option name="lastDownload" value="19700411" />
	     </file>

	<file id="1"
	    key="USK@[...]/thisIsAnHTMLFile.html"
	    size="10240"
	    mime="text/html">

	      <option name="title" value="This is a file in HTML" />
	      <option name="author" value="Someone" />
	      <option name="lastDownload" value="20060603" />
	</file>

	<file id="2"
	    key="USK@[...]/thisIsAFile.odt"
	    size="20480"
	    mime="application/vnd.oasis.opendocument.text">
 
  	     <option name="title" value="This is an OpenDocument" />
	     <option name="author" value="Someone else" />
	     <option name="lastDownload" value="20060603" />
	</file>

	[...]

  </files>

<keywords>
	 <!-- v = value -->
	 <!-- <file id="file_id">position, position</file> --> 
	 <!-- A negative position means the word is in the filename -->
	 <!-- Positions are always counted starting from 1 -->
	 <word v="ThisIsAWord">
	   <file id="1">2,3</file>
	   <file id="3">1,8</file><!-- values inside <file></file> are word positions inside the given file -->
	   <file id="7">12,1</file>
	 </word>
	 <word v="HTML">
	   <file id="1">-4</file>
	 </word>

	 [...]

	  <!-- Sub-indexes list, for index splitting -->
	  <!-- Splitting is based on the first letters of each word -->
	  <subIndex key="CHK@[...]">
	    <wordsStartingWith>a</wordsStartingWith>
	  </subIndex>
	
	  <subIndex key="CHK@[...]">
	    <wordsStartingWith>bcd</wordsStartingWith>
	    <wordsStartingWith>brouzouf</wordsStartingWith>
	 </subIndex>
</keywords>

<!-- If comments are activated: -->
<comments publicKey="SSK@[...]/" privateKey="SSK@[...]/">
	<!-- When you insert a comment, insert it as USK@[...]/comment/0/comment.xml; the node will automatically put it at the latest revision -->
	<!-- See the format below -->
	<!-- To get the comments, fetch them manually with SSK@[...]comment-[rev]/comment.xml -->
	<!-- Always start fetching the comments from 0, even if you already know them: this avoids losing comments over time -->
	<!-- Some comments may be missing, so you can try fetching 5 comments at once -->
	<!-- If the index owner changes the keys, purge all the messages that your client knows -->
   
	<blackListed rev="5" /> <!-- ignore this comment -->
	<blackListed rev="7" /> <!-- ignore this comment -->
</comments>

</index>
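
As a rough illustration of how a client might read the keywords section, here is a minimal Python sketch. The function name `parse_keywords` and the trimmed-down `SAMPLE` document are made up for this example. Treating the absolute value of a negative position as the position inside the filename is an assumption; the notes above only say that a negative position means the word is in the filename.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample following the index format above (keys left elided).
SAMPLE = """<index>
<files>
  <file id="1" key="USK@[...]/thisIsAnHTMLFile.html" size="10240" mime="text/html" />
</files>
<keywords>
  <word v="ThisIsAWord"><file id="1">2,3</file></word>
  <word v="HTML"><file id="1">-4</file></word>
</keywords>
</index>"""


def parse_keywords(xml_text):
    """Return {word: {file_id: {"content": [...], "filename": [...]}}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for word in root.findall("./keywords/word"):
        per_file = {}
        for entry in word.findall("file"):
            raw = (entry.text or "").split(",")
            positions = [int(p) for p in raw if p.strip()]
            per_file[entry.get("id")] = {
                # Positive positions are word positions in the file content.
                "content": [p for p in positions if p > 0],
                # Assumption: -4 means position 4 inside the filename.
                "filename": [-p for p in positions if p < 0],
            }
        result[word.get("v")] = per_file
    return result


if __name__ == "__main__":
    for word, hits in parse_keywords(SAMPLE).items():
        print(word, hits)
```

File ids are kept as strings here, since the format does not say they have to be numeric.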

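The comment-fetching notes in the index sample could be sketched as follows. This is only one reading of them: `comment_uris` is a hypothetical helper, the URI is built by appending `comment-[rev]/comment.xml` to the `publicKey` attribute (which ends with `/` in the sample), and blacklisted revisions are skipped before fetching rather than filtered at display time. Both choices are assumptions.

```python
import xml.etree.ElementTree as ET

# The <comments> element from the index sample, keys left elided.
SAMPLE_COMMENTS = """<comments publicKey="SSK@[...]/" privateKey="SSK@[...]/">
  <blackListed rev="5" />
  <blackListed rev="7" />
</comments>"""


def comment_uris(comments_xml, first_rev=0, count=5):
    """List the comment URIs for `count` revisions starting at `first_rev`.

    Defaults follow the notes above: start from 0, try 5 at once.
    """
    node = ET.fromstring(comments_xml)
    public_key = node.get("publicKey")
    blacklisted = {int(b.get("rev")) for b in node.findall("blackListed")}
    return [
        f"{public_key}comment-{rev}/comment.xml"
        for rev in range(first_rev, first_rev + count)
        if rev not in blacklisted  # assumption: do not even fetch these
    ]


if __name__ == "__main__":
    for uri in comment_uris(SAMPLE_COMMENTS, first_rev=5):
        print(uri)
```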

Sub-index format


This one is quite similar to the previous one, in the hope that it will help developers reuse their code.

<?xml version='1.0' encoding='utf-8'?>

<index>

<files>
	<!-- Lazy mode on: *** Put a file list specific to this sub-index here *** (see the format used in the main index) -->
	<!-- These files do not need to already be in the main index -->
</files>

<keywords>
	 <!-- v = value  -->
	 <!-- <file id="file_id">position,position</file> --> 
	 <!-- A negative position means the word is in the filename -->
	 <!-- Positions are always counted starting from 1 -->
	 <!-- File ids must correspond to files defined in this sub-index -->
	 <word v="AWord">
	   <file id="1">2,3</file>
	   <file id="3">1,8</file>
	   <file id="7">12,1</file>
	 </word>
	 <word v="Another">
	   <file id="1">-5</file>
	 </word>

	 [...]

	  <!-- Sub-indexes list, for index splitting -->
	  <!-- Splitting is based on the first letters of each word -->
	  <subIndex key="CHK@[...]">
	    <wordsStartingWith>abc</wordsStartingWith>
	  </subIndex>
	
	  <subIndex key="CHK@[...]">
	    <wordsStartingWith>acd</wordsStartingWith>
	    <wordsStartingWith>azz</wordsStartingWith>
	 </subIndex>	
</keywords>

</index>
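
To find which sub-index may hold a given word, a client could match the word against the `wordsStartingWith` values. The sketch below is one possible interpretation, not part of the format: `sub_indexes_for` is a hypothetical name, each `wordsStartingWith` value is treated as a literal prefix, matching is case-insensitive, and the `/first` and `/second` suffixes on the elided keys exist only to tell the two entries apart in the example.

```python
import xml.etree.ElementTree as ET

# Keywords section with two sub-indexes, as in the main index sample.
# The "/first" and "/second" suffixes are placeholders for this example only.
SAMPLE_KEYWORDS = """<keywords>
  <subIndex key="CHK@[...]/first">
    <wordsStartingWith>a</wordsStartingWith>
  </subIndex>
  <subIndex key="CHK@[...]/second">
    <wordsStartingWith>bcd</wordsStartingWith>
    <wordsStartingWith>brouzouf</wordsStartingWith>
  </subIndex>
</keywords>"""


def sub_indexes_for(keywords_xml, word):
    """Return the keys of the sub-indexes whose prefixes match `word`."""
    root = ET.fromstring(keywords_xml)
    lowered = word.lower()  # assumption: prefixes are case-insensitive
    matches = []
    for sub in root.findall("subIndex"):
        prefixes = [
            (p.text or "").strip().lower()
            for p in sub.findall("wordsStartingWith")
        ]
        # Assumption: each value is a literal prefix of the word.
        if any(p and lowered.startswith(p) for p in prefixes):
            matches.append(sub.get("key"))
    return matches


if __name__ == "__main__":
    for w in ("apple", "Brouzoufs", "zebra"):
        print(w, sub_indexes_for(SAMPLE_KEYWORDS, w))
```

Since a sub-index can itself list further sub-indexes, the same lookup would be repeated on each fetched sub-index until the word is found or nothing matches.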


Comment format


<?xml version='1.0' encoding='utf-8'?>
<comment>

<author>putANickNameHere</author> <!-- when displaying it, Thaw appends "@"+Base64.encode(SHA256.digest(y)) -->

<text>putACommentHere</text>

<signature><!-- All comments must be signed. NO EXCEPTIONS -->

<!-- In Thaw, the signature is generated using the RSA implementation class in Frost (itself using BouncyCastle) -->
<!-- The signature is generated from the following content: -->
<!-- comment publicKey (starting with 'SSK@' and ending at the first '/' (inclusive))
	nickname of the user (without the "@"+hashOfThePublicKey)
	text of the comment
-->
<!-- element1 + "-" + element2 + "-" + [...]  -->
<!-- The recommendation is to use the signature to check whether a message is already in the database -->

  <sig>[...]</sig>

  <publicKey>[...]</publicKey>

</signature>

</comment>
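
The content that gets signed can be assembled as in the sketch below. It only builds the string described in the comments above (key part, nickname, text, joined with "-"); the RSA signing itself is left out. The name `signed_payload` and the key `SSK@abc/` are hypothetical, and the byte encoding applied before signing is not specified here, so the function returns a plain string.

```python
def signed_payload(comment_public_key, nickname, text):
    """Build the string that the comment signature is computed over.

    `nickname` is expected without the "@"+hash suffix added at display.
    """
    slash = comment_public_key.find("/")
    if not comment_public_key.startswith("SSK@") or slash == -1:
        raise ValueError("expected a key starting with 'SSK@' and containing '/'")
    # Keep everything from 'SSK@' up to and including the first '/'.
    key_part = comment_public_key[: slash + 1]
    return "-".join([key_part, nickname, text])


if __name__ == "__main__":
    # "SSK@abc/" is a made-up key used for illustration only.
    print(signed_payload("SSK@abc/comment-3/comment.xml", "nick", "hello"))
```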


Recommendations

