Anonymous committed 0827a72

v1.3a4 - added a few pages to the docs (Getting started) and set entry point for widget contextual help.

Files changed (40)

_textable/__init__.py

+doc_root = [
+    ("http://orange-textable.readthedocs.org/en/latest/", None),
+]
+
+

docs/rst/annotation.rst

+.. _Annotation:
+
 Annotation
 ==========
 

docs/rst/context.rst

+.. _Context:
+
 Context
 =======
 

docs/rst/convert.rst

+.. _Convert:
+
 Convert
 =======
 

docs/rst/converting_table_formats.rst

 Converting between table formats
 ================================
 
-In preparation.
+Orange Canvas has a "native" type for representing data tables, namely
+*ExampleTable*. However, this type does not support Unicode well, which is
+a serious limitation from the perspective of text processing. To overcome this
+issue (as much as possible), Orange Textable defines its own table
+representation format, simply called *Table*.
+
+Every :doc:`table construction widget <table_construction_widgets>` in Orange
+Textable emits data in *Table* format. Instances of these widgets must then
+be connected with an instance of :ref:`Convert`, which serves two main
+purposes:
+
+    -   It *converts* data in Orange Textable's *Table* format to the native
+        *ExampleTable* format of Orange Canvas, which makes it possible to
+        use the other widgets of Orange Canvas for visualizing, modifying,
+        analyzing, etc. tables built with Orange Textable.
+        
+    -   It *exports* data in *Table* format to text files, in tab-delimited
+        format, typically in order to import them later in a third party data
+        analysis software package; at the time of writing, this scenario is
+        the only way to correctly visualize a table containing data encoded in
+        Unicode.
+        
+As shown on :ref:`figure 1 <converting_table_formats_fig1>` below, section
+*Conversion* of the widget's interface lets the user choose the encoding
+of the *ExampleTable* object produced in output (**Orange table encoding**);
+variants of Unicode should be avoided here since they are currently not well
+supported by other widgets in Orange Canvas.
+
+.. _converting_table_formats_fig1:
+
+.. figure:: figures/convert_example.png
+    :align: center
+    :alt: Basic interface of widget Convert
+    :figclass: align-center
+
+    Figure 1: Basic interface of widget :ref:`Convert`.
+
+The encoding for text file export can be selected in the *Export* section
+(**Output file encoding**); here nothing speaks against the use of Unicode.
+Checkbox **Include Orange headers** triggers inclusion of
+additional table headers in the case where the output file should be later
+re-imported in Orange Canvas. Export proper is performed by clicking the
+**Export** button and selecting the output file in the dialog that appears.
+
+The take-home message here is this: when you create an instance of a
+:doc:`table construction widget <table_construction_widgets>`, you should
+systematically create a new instance of :ref:`Convert` and connect
+them together. Usually, moreover, you will want to connect the
+:ref:`Convert` instance to a *Data Table* instance (from the *Data*
+tab of Orange Canvas) in order to view the table just built--except in the
+case where it contains Unicode data that wouldn't display correctly in
+*Data table*.
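
The export step described above can be approximated outside Orange Canvas. The sketch below is not Orange Textable's implementation: it assumes a table held as a plain list of rows, and the names `rows` and `export_tab_delimited` are hypothetical. It writes the table as tab-delimited text in a chosen encoding using only the Python standard library.

```python
import csv
import io
import os
import tempfile


def export_tab_delimited(rows, path, encoding="utf-8"):
    """Write rows (lists of strings) to path as tab-delimited text."""
    # newline="" lets the csv module control line endings itself.
    with io.open(path, "w", encoding=encoding, newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


# Hypothetical table containing non-ASCII content.
rows = [["unit", "frequency"], ["é", "3"], ["ü", "1"]]
path = os.path.join(tempfile.mkdtemp(), "table.txt")
export_tab_delimited(rows, path)

# Read the file back with the same encoding to check the round trip.
with io.open(path, encoding="utf-8") as handle:
    exported = handle.read()
```

Choosing the encoding explicitly at write time is what keeps non-ASCII characters intact for the program that later imports the file.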
 

docs/rst/count.rst

+.. _Count:
+
 Count
 =====
 

docs/rst/counting_segment_types.rst

 Counting segment types
 ======================
 
-In preparation.
+Widget :ref:`Count` takes in input one or more segmentations and
+produces frequency tables such as tables 1 and 2
+:doc:`here <segmentations_tables>`. To try it out, create a scheme such as
+illustrated on :ref:`figure 1 <counting_segment_types_fig1>` below. As usual,
+we will suppose that the :ref:`Text Field` instance contains
+*a simple example*. The :ref:`Segment` instance is configured for
+letter segmentation (**Regex:** *\\w* and **Output segmentation label:**
+*letters*). The default configuration of the instances of
+:ref:`Convert` and *Data Table* (from the **Data** tab of Orange
+Canvas) need not be modified for this example.
+
+.. _counting_segment_types_fig1:
+
+.. figure:: figures/count_example_scheme.png
+    :align: center
+    :alt: Scheme for testing the Count widget
+    :figclass: align-center
+
+    Figure 1: Scheme for testing the :ref:`Count` widget.
+
+Basically, the purpose of widget :ref:`Count` is to determine the frequency
+of segment types in an input segmentation. The label of that segmentation must
+be indicated in the **Segmentation** menu of section **Units** in the widget's
+interface, while other controls may be left in their default state for now
+(see :ref:`figure 2 <counting_segment_types_fig2>` below). Clicking
+**Compute** then double-clicking the *Data Table* instance should display
+essentially the same data as table 1
+:ref:`here <segmentations_tables_table1>` (with possible variations in
+the order of columns).
+
+.. _counting_segment_types_fig2:
+
+.. figure:: figures/count_example.png
+    :align: center
+    :alt: Counting the frequency of letter types with widget Count
+    :figclass: align-center
+
+    Figure 2: Counting the frequency of letter types with widget :ref:`Count`.
+
+Note that checkbox *Compute automatically* is unchecked by default so that
+the user must click on **Compute** to trigger computations. The motivation for
+this default setting is that
+:doc:`table construction widgets <table_construction_widgets>` can be quite
+slow when operating on large segmentations, and it can be annoying to see
+computations starting again whenever an interface element is modified.
+
+To obtain the frequency of letter *bigrams* (i.e. pairs of successive
+letters), simply set parameter **Sequence length** to 2 (see
+:ref:`table 1 <counting_segment_types_table1>` below). If the value of this
+parameter is greater than 1, the string specified in field **Intra-sequence
+delimiter** is inserted between successive segments for the sake of
+readability--which is more useful when segments are longer than individual
+letters. Note that in this example, word boundaries are not taken into
+account--nor even known, in fact--which is why bigrams *as* and *ee* have a
+nonzero frequency.
+
+.. _counting_segment_types_table1:
+
+.. csv-table:: Table 1: Letter bigram frequency.
+    :header: *as*, *si*, *im*, *mp*, *pl*, *le*, *ee*, *ex*, *xa*, *am*
+    :stub-columns: 0
+    :widths: 3 3 3 3 3 3 3 3 3 3
+
+    1,   1,   1,   2,   2,   2,   1,   1,   1,   1
+
+
+
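
The counts discussed in this page can be reproduced with a few lines of plain Python. This is only an illustrative sketch, not the code of the Count widget: the helper `count_ngrams` and its parameters `n` and `delimiter` are hypothetical stand-ins for **Sequence length** and **Intra-sequence delimiter**.

```python
import re
from collections import Counter


def count_ngrams(segments, n=1, delimiter=""):
    """Count sequences of n successive segments, joined by delimiter."""
    sequences = (
        delimiter.join(segments[i:i + n])
        for i in range(len(segments) - n + 1)
    )
    return Counter(sequences)


text = "a simple example"
letters = re.findall(r"\w", text)  # letter segmentation, as with regex \w

letter_counts = count_ngrams(letters)        # sequence length 1
bigram_counts = count_ngrams(letters, n=2)   # sequence length 2
```

Because the letter list carries no word boundaries, the bigrams *as* and *ee* are counted even though each of them spans two words.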
 

docs/rst/display.rst

+.. _Display:
+
 Display
 =======
 

docs/rst/extract_xml.rst

+.. _Extract XML:
+
 Extract XML
 ===========
 

docs/rst/figures/convert_example.png

Added

docs/rst/figures/count_example.png

Added

docs/rst/figures/count_example_scheme.png

Added

docs/rst/figures/solution_exercise_intersect.png

Added

docs/rst/hierarchical_segmentations_performance_issues.rst

 Hierarchical segmentations and performance issues
 =================================================
 
-When widget :doc:`Segment <segment>` is applied to real, much longer texts
+When widget :ref:`Segment` is applied to real, much longer texts
 than *a simple example*, using such general regexes as *\\w+* or *\\w* may
 result in the creation of a huge number of segments. Creating and manipulating
 such segmentations can excessively slow down the execution of Orange Textable,
     :alt: chained hierarchical segmentations execute faster
     :figclass: align-center
 
-    Figure 1: Chaining :doc:`Segment <segment>` instances to reduce execution time.
+    Figure 1: Chaining :ref:`Segment` instances to reduce execution time.
 
 The situation is different when word or letter segmentations are conceived
 as intermediate steps toward the creation of a segmentation containing only
 selected words or letters. In that case, it is much more efficient (in memory
-and execution time) to use a single instance of :doc:`Segment <segment>` with
+and execution time) to use a single instance of :ref:`Segment` with
 a regex identifying only the desired words, as seen
 :doc:`previously <segmenting_data_smaller_units>`
 with the example of *\\bretriev(e|es|ed|ing)\\b*.
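
The difference between the two strategies can be illustrated in plain Python. This is a sketch under stated assumptions, not Orange Textable code: the sample `text` is invented, and the list-based filtering merely stands in for a chain of widgets. Both approaches yield the same segments, but the single regex never materializes the full word list.

```python
import re

# Hypothetical sample text.
text = "She retrieves files; retrieval is slow, but retrieving helps."

# Two-step approach: build the full word segmentation, then filter it.
words = re.findall(r"\w+", text)
wanted = re.compile(r"retriev(e|es|ed|ing)")
two_step = [w for w in words if wanted.fullmatch(w)]

# Single-step approach: match only the desired forms directly.
single_step = [
    m.group(0) for m in re.finditer(r"\bretriev(e|es|ed|ing)\b", text)
]
```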

docs/rst/index.rst

 
 .. _SLI: http://www.unil.ch/sli
 
-Contents
---------
-
 .. toctree::
     :maxdepth: 3
    
     Widget reference <widget_reference>
     Cookbook <cookbook>
     Case studies <case_studies>
-
-Further resource
-----------------
-
-The project's homepage is hosted at `langtech.ch
-<http://langtech.ch/textable>`_. Links to the source repository
-and further resource may be found there.
-
-

docs/rst/intersect.rst

+.. _Intersect:
+
 Intersect
 =========
 

docs/rst/keyboard_input_segmentation_display.rst

 Keyboard input and segmentation display
 =======================================
 
-Typing text in a :doc:`Text Field <text_field>` widget is the simplest way to
+Typing text in a :ref:`Text Field` widget is the simplest way to
 import a string in Orange Textable. This widget has no input connexions, and
 emits in output a segmentation containing a single segment whose address
 points to the entire string that was typed. This segmentation is assigned the
     :alt: Example usage of widget Text Field
     :figclass: align-center
 
-    Figure 1: Typing *a simple example* in widget :doc:`Text Field <text_field>`.
+    Figure 1: Typing *a simple example* in widget :ref:`Text Field`.
     
 This widget's simplicity makes it most adequate for pedagogic purposes. Later,
 we will discover other, more powerful ways of importing strings.
 
-The :doc:`Display <display>` widget can be used to visualize the details
+The :ref:`Display` widget can be used to visualize the details
 of a segmentation. By default, it shows the segmentation's label followed by
 each successive segment's address and content. A segmentation sent by a
-:doc:`Text Field <text_field>` instance will contain a single segment
+:ref:`Text Field` instance will contain a single segment
 covering the whole string (see :ref:`figure 2
 <keyboard_input_segmentation_fig2>` below).
 
     :alt: Example usage of widget Display
     :figclass: align-center
 
-    Figure 2: Viewing *a simple example* in widget :doc:`Display <display>`.
+    Figure 2: Viewing *a simple example* in widget :ref:`Display`.
     
-By default, :doc:`Display <display>` passes its input data without
+By default, :ref:`Display` passes its input data without
 modification to its output connexions. It is very useful for viewing
 intermediate results in an Orange Textable scheme and making sure that other
 widgets process data as expected.

docs/rst/length.rst

+.. _Length:
+
 Length
 ======
 

docs/rst/merge.rst

+.. _Merge:
+
 Merge
 =====
 

docs/rst/merging_segmentations_together.rst

 Computerized text analysis often implies consolidating various text sources
 into a single *corpus*. In the framework of Orange Textable, this amounts
 to grouping segmentations together, and it is the purpose of the
-:doc:`Merge <merge>` widget.
+:ref:`Merge` widget.
 
 To try out this widget, create on the canvas two instances of
-:doc:`Text Field <text_field>`, an instance of :doc:`Merge <merge>` and an
-instance of :doc:`Display <display>` (see
+:ref:`Text Field`, an instance of :ref:`Merge` and an
+instance of :ref:`Display` (see
 :ref:`figure 1 <merging_segmentations_together_fig1>` below). Type
-a different string in each :doc:`Text Field <text_field>` instance (e.g.
+a different string in each :ref:`Text Field` instance (e.g.
 *a simple example* and *another example*) and assign it a distinct label (e.g.
 *text_string* and *text_string2*). Finally, connect the instances as
 shown on :ref:`figure 1 <merging_segmentations_together_fig1>`.
     :alt: Scheme illustrating the usage of widget Merge
     :figclass: align-center
 
-    Figure 1: Grouping *a simple example* with *another example* using widget :doc:`Merge <merge>`.
+    Figure 1: Grouping *a simple example* with *another example* using widget :ref:`Merge`.
 
-The interface of widget :doc:`Merge <merge>` (see
+The interface of widget :ref:`Merge` (see
 :ref:`figure 2 <merging_segmentations_together_fig2>` below) illustrates a
 feature shared by most Orange Textable widgets: the **Advanced settings**
 checkbox triggers the display of more complex controls offering more
     :alt: Interface of widget Merge
     :figclass: align-center
 
-    Figure 2: Interface of widget :doc:`Merge <merge>`.
+    Figure 2: Interface of widget :ref:`Merge`.
     
 Section **Ordering** of the widget's interface lets the user view the labels
 of incoming segmentations and control the order in which they will appear in
 
 :ref:`Figure 3 <merging_segmentations_together_fig3>` above shows the
 resulting merged segmentation, as displayed by widget
-:doc:`Display <display>`. As can be seen, :doc:`Merge <merge>` makes it easy
+:ref:`Display`. As can be seen, :ref:`Merge` makes it easy
 to concatenate several strings into a single segmentation. If the incoming
 segmentations contained several segments, each of them would appear in the
 output segmentation, in the order specified under **Ordering** (and, within
 each incoming segmentation, in the original order of segments).
 
-.. _merging_segmentations_together_ex1:
+.. _merging_segmentations_together_ex:
 
-**Exercise 1:** Can you add a new instance of :doc:`Merge <merge>` to the
+**Exercise:** Can you add a new instance of :ref:`Merge` to the
 scheme illustrated on :ref:`figure 1 <merging_segmentations_together_fig1>`
 above and modify the connections (but not the configuration of existing
 widgets) so that the segmentation given in
 :ref:`figure 4 <merging_segmentations_together_fig4>` below appears in the
-:doc:`Display <display>` widget?
-(:ref:`solution <solution_merging_segmentations_together_ex1>`)
+:ref:`Display` widget?
+(:ref:`solution <solution_merging_segmentations_together_ex>`)
 
 .. _merging_segmentations_together_fig4:
 
     :alt: 3 segments: "a simple example", "another example", "another example"
     :figclass: align-center
 
-    Figure 4: The segmentation requested in :ref:`exercise 1 <merging_segmentations_together_ex1>`.
+    Figure 4: The segmentation requested in the :ref:`exercise <merging_segmentations_together_ex>`.
 
-.. _solution_merging_segmentations_together_ex1:
+.. _solution_merging_segmentations_together_ex:
 
-**Solution to exercise 1:** (:ref:`back to the exercise <merging_segmentations_together_ex1>`)
+**Solution:** (:ref:`back to the exercise <merging_segmentations_together_ex>`)
 
 .. figure:: figures/solution_exercise_merge.png
     :align: center
     :alt: New Merge widget takes input from old one and Text field, and sends output to Display
     :figclass: align-center
+    :scale: 80 %
 
-    Figure 5: Solution to :ref:`exercise 1 <merging_segmentations_together_ex1>`.
+    Figure 5: Solution to the :ref:`exercise <merging_segmentations_together_ex>`.
 

docs/rst/partitioning_segmentations.rst

 There are many situations where we might want to selectively include or exclude
 segments from a segmentation. For instance, a user might want to exclude
 from a word segmentation all those that are less than 4 letters long. The
-:doc:`Select <select>` widget is tailored for such tasks.
+:ref:`Select` widget is tailored for such tasks.
 
 The widget's interface (see :ref:`figure 1 <partitioning_segmentations_fig1>`
 below) offers a choice between two modes: *Include* and *Exclude*. Depending
     :alt: Example usage of widget Select
     :figclass: align-center
 
-    Figure 1: Excluding short words with widget :doc:`Select <select>`.
+    Figure 1: Excluding short words with widget :ref:`Select`.
 
 In the example of :ref:`figure 1 <partitioning_segmentations_fig1>`, the
 widget is configured to exclude all incoming segments containing no more than
 anchors (*^* and *$*), all words containing *at least* a sequence of 1 to 3
 letters--i.e. all the words--would be excluded.
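
The role of the anchors can be checked with a small Python sketch. It is not the Select widget's code: the helper `partition`, the sample sentence, and the assumption that a segment is excluded whenever the regex is found anywhere in it (as `re.search` does) are all illustrative.

```python
import re


def partition(segments, pattern):
    """Split segments into (kept, discarded), as in Exclude mode."""
    regex = re.compile(pattern)
    kept, discarded = [], []
    for segment in segments:
        # A segment is discarded as soon as the regex matches somewhere in it.
        (discarded if regex.search(segment) else kept).append(segment)
    return kept, discarded


words = re.findall(r"\w+", "a simple example of the widget")

# Anchored: only words made of 1 to 3 characters are discarded.
kept, discarded = partition(words, r"^\w{1,3}$")

# Unanchored: every word contains such a sequence, so all are discarded.
kept_unanchored, discarded_unanchored = partition(words, r"\w{1,3}")
```

The second return value plays the part of the segmentation of discarded segments that the widget emits alongside its main output.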
 
-Note that :doc:`Select <select>` automatically emits a second segmentation
+Note that :ref:`Select` automatically emits a second segmentation
 containing all the segments that have been discarded from the main output
 segmentation (in the case of :ref:`figure 1 <partitioning_segmentations_fig1>`
 above, that would be all words less than 4 letters long). This feature is
 useful when both the selected *and* the discarded segments are to be further
-processed on distinct branches. By default, when :doc:`Select <select>` is connected to another widget, the
+processed on distinct branches. By default, when :ref:`Select` is connected to another widget, the
 main segmentation is emitted. In order to send the segmentation of
 discarded segments instead, right-click on the outgoing connexion and select
 **Reset Signals** (see :ref:`figure 2 <partitioning_segmentations_fig2>`

docs/rst/preprocess.rst

+.. _Preprocess:
+
 Preprocess
 ==========
 

docs/rst/recode.rst

+.. _Recode:
+
 Recode
 ======
 

docs/rst/segment.rst

+.. _Segment:
+
 Segment
 =======
 

docs/rst/segmentation_processing_widgets.rst

 ===============================
 
 Widgets of this category take *Segmentation* data in input and emit data of
-the same type. Some of them (:doc:`Preprocess <preprocess>` and
-:doc:`Recode <recode>`) generate modified text data. Others
-(:doc:`Merge <merge>`, :doc:`Segment <segment>`, :doc:`Select <select>`,
-:doc:`Intersect <intersect>` and :doc:`Extract XML <extract_xml>`) do not
-generate new text data but only new *Segmentation* data.
-:doc:`Display <display>`, finally, is mainly used to visualize the details of
-a given *Segmentation* object (content and address of segments, as well as
-their possible annotations).
+the same type. Some of them (:ref:`Preprocess` and :ref:`Recode`) generate
+modified text data. Others (:ref:`Merge`, :ref:`Segment`, :ref:`Select`,
+:ref:`Intersect` and :ref:`Extract XML`) do not generate new text data but
+only new *Segmentation* data. :ref:`Display`, finally, is mainly used to
+visualize the details of a given *Segmentation* object (content and address of
+segments, as well as their possible annotations).
 
 .. toctree::
     :maxdepth: 1

docs/rst/segmentations_tables.rst

 From segmentations to tables
 ============================
 
-In preparation.
+The main purpose of Orange Textable is to build tables based on texts. Central
+to this process are the segmentations we have learned to create and manipulate
+earlier. Indeed, Orange Textable provides a number of
+:doc:`widgets for table construction <table_construction_widgets>`, and they
+all operate on the basis of one or more segmentations.
+
+For the time being, we will focus on the construction of frequency tables,
+which are very common in computerized text analysis and which will serve as
+an introduction to other types of tables. For the sake of simplicity, consider
+first the segmentation of *a simple example* into letters. Counting the
+frequency of each letter type yields a table such as the following:
+
+.. _segmentations_tables_table1:
+
+.. csv-table:: Table 1: Frequency of letter types.
+    :header: *a*, *s*, *i*, *m*, *p*, *l*, *e*, *x*
+    :stub-columns: 0
+    :widths: 3 3 3 3 3 3 3 3
+
+    2,   1,   1,   2,   2,   2,   3,   1
+
+More often, we will be interested in comparing frequency across several
+*contexts*. For instance, if the word segmentation of *a simple example* is
+also available, it may be used together with the letter segmentation to
+produce a so-called *contingency table* (or *document--term matrix*):
+
+.. _segmentations_tables_table2:
+
+.. csv-table:: Table 2: Frequency of letters within words.
+    :header: "", *a*, *s*, *i*, *m*, *p*, *l*, *e*, *x*
+    :stub-columns: 1
+    :widths: 10 3 3 3 3 3 3 3 3
+
+    *a*,       1,   0,   0,   0,   0,   0,   0,   0
+    *simple*,  0,   1,   1,   1,   1,   1,   1,   0
+    *example*, 1,   0,   0,   1,   1,   1,   2,   1
+
+In a real application, rows could correspond to the writings of an author and
+columns to selected prepositions, for instance. The general idea is to
+determine the number of occurrences of various *units* in various *contexts*.
+Such data can then be further analyzed by means of a statistical test (aiming
+at answering the question "does the distribution of units depend on contexts?")
+or a graphical representation (making it possible to visualize the attraction
+or repulsion between specific units and contexts).
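
A contingency table like the one above (letters within words) can be built in a few lines of Python. This is an illustrative sketch rather than Orange Textable's implementation; it assumes that all words in the text are distinct, since each word is used as a dictionary key, and the names `rows`, `columns`, and `table` are hypothetical.

```python
import re
from collections import Counter

text = "a simple example"
words = re.findall(r"\w+", text)  # contexts (one row per word)

# One Counter of letter types per word; missing letters count as 0.
rows = {word: Counter(re.findall(r"\w", word)) for word in words}

# Columns are the letter types, in order of first appearance in the text.
columns = list(dict.fromkeys(re.findall(r"\w", text)))

table = [[rows[word][letter] for letter in columns] for word in words]
```

Summing each column of `table` gives back the plain frequency of each letter type in the whole text.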
 

docs/rst/segmenting_data_smaller_units.rst

 inverse operation: create a segmentation whose segments are *parts* of another
 segmentation's segments. Typically, we will be segmenting strings into words,
 characters, or any kind of text units that will be later counted, measured,
-and so on. This is precisely the purpose of widget :doc:`Segment <segment>`.
+and so on. This is precisely the purpose of widget :ref:`Segment`.
 
-To try it out, create a new scheme with an instance of
-:doc:`Text Field <text_field>` connected to an instance of
-:doc:`Segment <segment>`, itself connected to an instance of
-:doc:`Display <display>` (see
-:ref:`figure 1 <segmenting_data_smaller_units_fig1>` below). In what follows,
-we will suppose that the string typed in :doc:`Text Field <text_field>` is
-*a simple example*.
+To try it out, create a new scheme with an instance of :ref:`Text Field`
+connected to an instance of :ref:`Segment`, itself connected to an instance of
+:ref:`Display` (see :ref:`figure 1 <segmenting_data_smaller_units_fig1>`
+below). In what follows, we will suppose that the string typed in
+:ref:`Text Field` is *a simple example*.
 
 .. _segmenting_data_smaller_units_fig1:
 
     :alt: Scheme illustrating the usage of widget Segment
     :figclass: align-center
 
-    Figure 1: A scheme for testing the :doc:`Segment <segment>` widget
+    Figure 1: A scheme for testing the :ref:`Segment` widget
     
 In its basic form (i.e. with **Advanced settings** unchecked, see
 :ref:`figure 2 <segmenting_data_smaller_units_fig2>` below),
-:doc:`Segment <segment>` takes a single parameter (aside from the
+:ref:`Segment` takes a single parameter (aside from the
 **Output segmentation label**), namely a regex. The widget then looks for all
 matches of the regex pattern in each successive input segment, and creates for
 every match a new segment in the output segmentation.
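
The matching behaviour just described can be mimicked with Python's `re` module. The sketch below is an assumption-laden illustration, not the widget's code: the helper `segment` is hypothetical, and a segment is represented simply as a `(start, end, content)` tuple.

```python
import re


def segment(text, pattern):
    """Return (start, end, content) for every match of pattern in text."""
    return [
        (m.start(), m.end(), m.group(0))
        for m in re.finditer(pattern, text)
    ]


word_segments = segment("a simple example", r"\w+")
letter_segments = segment("a simple example", r"\w")
```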
     :alt: Interface of widget Segment configured with regex "\w+"
     :figclass: align-center
 
-    Figure 2: Interface of the :doc:`Segment <segment>` widget, configured for word segmentation
+    Figure 2: Interface of the :ref:`Segment` widget, configured for word segmentation
 
 For instance, the regex *\\w+* divides each incoming segment into sequences of
 alphanumeric characters (and underscores)--which in our case amounts to

docs/rst/select.rst

+.. _Select:
+
 Select
 ======
 

docs/rst/strings_segments_segmentations.rst

 The main purpose of Orange Textable is to build tables based on text strings.
 As we will see, there are several methods for importing text strings, the
 simplest of which is keyboard input using widget
-:doc:`Text Field <text_field>` (see also :doc:`Keyboard input and segmentation
+:ref:`Text Field` (see also :doc:`Keyboard input and segmentation
 display <keyboard_input_segmentation_display>`). Whenever a new string is
 imported, it is assigned a unique identification number (called
 *string index*) and stays in memory as long as the widget that imported it.

docs/rst/table_construction_widgets.rst

 
 Widgets of this category take *Segmentation* data in input and emit *Table*
 data. They are thus ultimately responsible for converting text to tables,
-either by counting items (:doc:`Count <count>`), by measuring their length
-(:doc:`Length <length>`), by quantifying their diversity
-(:doc:`Variety <variety>`), or by exploiting the annotations associated with
-them (:doc:`Annotation <annotation>`). Finally, :doc:`Context <context>` makes
-it possible to build concordances and collocation lists.
+either by counting items (:ref:`Count`), by measuring their length
+(:ref:`Length`), by quantifying their diversity (:ref:`Variety`), or by
+exploiting the annotations associated with them
+(:ref:`Annotation`). Finally, widget :ref:`Context` makes it possible to build
+concordances and collocation lists.
 
 .. toctree::
     :maxdepth: 1

docs/rst/table_conversion_export_widget.rst

 Table conversion/export widget
 ==============================
 
-The only widget in this category, :doc:`Convert <convert>`, takes *Table* data
+The only widget in this category, :ref:`Convert`, takes *Table* data
 in input and emits *ExampleTable* data for further processing with Orange
 Canvas. It also makes it possible to apply various standard transforms to a
 table, such as sorting, normalizing, etc., as well as to export its contents

docs/rst/tables.rst

 Tables
 ======
 
-Segmentations are to tables what a means is to an end.
+Segmentations are to tables what a means is to an end. In this section, you
+will learn how to go from the former to the latter.
 
 .. toctree::
     :maxdepth: 1

docs/rst/text_field.rst

+.. _Text Field:
+
 Text Field
 ==========
 

docs/rst/text_files.rst

+.. _Text Files:
+
 Text Files
 ==========
 

docs/rst/text_import_widgets.rst

 
 Widgets of this category take no input and emit *Segmentation* data. Their
 purpose is to import text data in Orange Canvas, either from the keyboard
-(:doc:`Text Field <text_field>`), from files (:doc:`Text Files <text_files>`),
-or from the Internet (:doc:`URLs <urls>`).
+(:ref:`Text Field`), from files (:ref:`Text Files`), or from the Internet
+(:ref:`URLs`).
 
 .. toctree::
     :maxdepth: 1

docs/rst/urls.rst

+.. _URLs:
+
 URLs
 ====
 

docs/rst/using_segmentation_filter_another.rst

 ======================================
 
 In some cases, the number of forms to be selectively included in or excluded
-from a segmentation is too large for using the :doc:`Select <select>` widget.
+from a segmentation is too large for using the :ref:`Select` widget.
 A typical example is the removal of "stopwords" from a text: in English for
 instance, although the list of such words is finite, it is too long to try
 to encode it by means of a regex (cf. `an example of such a list
 <http://members.unine.ch/jacques.savoy/clef/englishST.txt>`_).
 
-The purpose of widget :doc:`Intersect <intersect>` is precisely to solve that
+The purpose of widget :ref:`Intersect` is precisely to solve that
 kind of problem. It takes two segmentations in input and lets the user include
 in or exclude from the first (*source*) segmentation those segments whose
 content is the same as that of a segment in the second (*filter*)
     :alt: Interface of widget Intersect configured for stopword removal
     :figclass: align-center
 
-    Figure 1: Interface of widget :doc:`Intersect <intersect>` configured for stopword removal.
+    Figure 1: Interface of widget :ref:`Intersect` configured for stopword removal.
     
-Similarly to widget :doc:`Select <select>`, user must choose between modes
+Similarly to widget :ref:`Select`, the user must choose between modes
 **Include** and **Exclude**. The next step is to specify which incoming
 segmentation plays the role of the **Source** segmentation and the **Filter**
 segmentation. (Here again, we will ignore the **Annotation key** option for
 
 In order to try out the widget, set up a scheme similar to the one shown on
 :ref:`figure 2 <using_segmentation_filter_another_fig2>` below. The first
-instance of :doc:`Text Field <text_field>` contains the text to process (for
+instance of :ref:`Text Field` contains the text to process (for
 instance the
 `Universal Declaration of Human Rights <http://www.un.org/en/documents/udhr/>`_),
 while the second instance, *Text Field (1)*, contains the list of English
-stopwords mentioned above. Both instances of :doc:`Segment <segment>` produce
+stopwords mentioned above. Both instances of :ref:`Segment` produce
 a word segmentation with regex *\\w+*; the only difference in their
 configuration is the output segmentation label, i.e. *words* for *Segment*
 and *stopwords* for *Segment (1)*. Finally, the instance of
-:doc:`Intersect <intersect>` is configured as shown on
+:ref:`Intersect` is configured as shown on
 :ref:`figure 1 <using_segmentation_filter_another_fig1>` above.
 
 .. _using_segmentation_filter_another_fig2:
     :alt: Scheme illustrating the use of the Intersect widget for stopword removal
     :figclass: align-center
 
-    Figure 2: Example scheme for removing stopword using widget :doc:`Intersect <intersect>` .
+    Figure 2: Example scheme for removing stopwords using widget :ref:`Intersect`.
 
 The content of the first segments of the resulting segmentation is:
 
     *world*
     ...
 
+.. _using_segmentation_filter_another_ex:
+
+**Exercise:** Based on an instance of :ref:`Text Field`, produce
+a segmentation containing all words less than 4 letters long that appear at
+the beginning of each line, excluding *I, you, he, she, we*.
+(:ref:`solution <solution_using_segmentation_filter_another_ex>`)
+
+.. _solution_using_segmentation_filter_another_ex:
+
+**Solution:**
+
+:ref:`Figure 3 <using_segmentation_filter_another_fig3>` below shows a possible
+solution. The 4 instances in the lower part of the scheme (*Text Field (1)*,
+*Segment (1)*, *Intersect*, and *Display*) are configured as in
+:ref:`figure 2 <using_segmentation_filter_another_fig2>` above--with
+*Text Field (1)* containing the list of pronouns to exclude.
+
+The difference lies in the addition of a :ref:`Segment` instance in
+the upper branch. In this branch, the first instance (*Segment*) produces a
+segmentation into lines with regex *.+* while *Segment (2)* extracts the first
+word of each line, provided it is shorter than 4 letters
+(regex *^\\w{1,3}\\b*). *Intersect* finally takes care of excluding the
+pronouns listed above.
+
+.. _using_segmentation_filter_another_fig3:
+
+.. figure:: figures/solution_exercise_intersect.png
+    :align: center
+    :alt: Solution to the exercise illustrating the Intersect widget
+    :figclass: align-center
+
+    Figure 3: A possible solution.
+
+(:ref:`back to the exercise <using_segmentation_filter_another_ex>`)
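
The logic of this solution can be traced in plain Python. The sketch is illustrative only and does not use Orange Textable: the multi-line `text` is invented, and a set lookup stands in for *Intersect* in **Exclude** mode.

```python
import re

# Hypothetical multi-line input.
text = "we are here\nthe cat sat\nalthough it rained\nhe left\nan apple fell"
excluded = {"I", "you", "he", "she", "we"}  # the filter segmentation

lines = re.findall(r".+", text)  # Segment: one segment per line

first_words = []
for line in lines:
    # Segment (2): first word of the line, if it has at most 3 characters.
    match = re.search(r"^\w{1,3}\b", line)
    if match:
        first_words.append(match.group(0))

# Intersect (Exclude mode): drop words that appear in the filter.
result = [word for word in first_words if word not in excluded]
```

The trailing `\b` is what rejects *although*: its first three letters are followed by another word character, so no boundary is found there.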
 

docs/rst/variety.rst

+.. _Variety:
+
 Variety
 =======
 

docs/user_guide/figures/Thumbs.db

Binary file modified.

 #!/usr/bin/env python
 
 #=============================================================================
-# File setup.py, v0.07
+# File setup.py, v0.08
 # Copyright 2012-2013 LangTech Sarl (info@langtech.ch)
 #=============================================================================
 # This file is part of the Textable (v1.3) extension to Orange Canvas.
 NAME = 'Orange-Textable'
 DOCUMENTATION_NAME = 'Orange Textable'
 
-VERSION = '1.3a3'
+VERSION = '1.3a4'
 
 DESCRIPTION = 'Orange Textable add-on for Orange data mining software package.'
 LONG_DESCRIPTION = open(os.path.join(os.path.dirname(__file__), 'README.rst')).read()
     'orange.widgets': (
         'Textable = _textable.widgets',
     ),
+    'orange.canvas.help': (
+        'intersphinx = _textable:doc_root',
+    ),
 }
 
 if __name__ == '__main__':