F.A.Q.

What are annotations?

They are objects just like entries. Senpy ships with several default annotations, including Sentiment and Emotion.

What’s a plugin made of?

When receiving a query, senpy selects what plugin or plugins should process each entry, and in what order. It also makes sure the every entry and the parameters provided by the user meet the plugin requirements.

Hence, two parts are necessary: 1) the code that will process the entry, and 2) some attributes and metadata that will tell senpy how to interact with the plugin.

In practice, this is what a plugin looks like, tests included:

#
#    Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
#    Licensed under the Apache License, Version 2.0 (the "License");
#    you may not use this file except in compliance with the License.
#    You may obtain a copy of the License at
#
#        http://www.apache.org/licenses/LICENSE-2.0
#
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS,
#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#    See the License for the specific language governing permissions and
#    limitations under the License.
#

import random
from senpy import SentimentPlugin, Sentiment, Entry


class RandSent(SentimentPlugin):
    '''A sample plugin that returns a random sentiment annotation'''
    name = 'sentiment-random'
    author = "@balkian"
    version = '0.1'
    url = "https://github.com/gsi-upm/senpy-plugins-community"
    marl__maxPolarityValue = '1'
    marl__minPolarityValue = "-1"

    def analyse_entry(self, entry, activity):
        polarity_value = max(-1, min(1, random.gauss(0.2, 0.2)))
        polarity = "marl:Neutral"
        if polarity_value > 0:
            polarity = "marl:Positive"
        elif polarity_value < 0:
            polarity = "marl:Negative"
        sentiment = Sentiment(marl__hasPolarity=polarity,
                              marl__polarityValue=polarity_value)
        sentiment.prov(activity)
        entry.sentiments.append(sentiment)
        yield entry

    def test(self):
        '''Run several random analyses.'''
        params = dict()
        results = list()
        for i in range(50):
            activity = self.activity(params)
            res = next(self.analyse_entry(Entry(nif__isString="Hello"),
                                          activity))
            res.validate()
            results.append(res.sentiments[0]['marl:hasPolarity'])
        assert 'marl:Positive' in results
        assert 'marl:Negative' in results

The lines highlighted contain some information about the plugin. In particular, the following information is mandatory:

  • A unique name for the class. In our example, sentiment-random.

  • The subclass/type of plugin. This is typically either SentimentPlugin or EmotionPlugin. However, new types of plugin can be created for different annotations. The only requirement is that these new types inherit from senpy.Analysis

  • A description of the plugin. This can be done simply by adding a doc to the class.

  • A version, which should get updated.

  • An author name.

How does senpy find modules?

Senpy looks for files of two types:

  • Python files of the form senpy_<NAME>.py or <NAME>_plugin.py. In these files, it will look for: 1) Instances that inherit from senpy.Plugin, or subclasses of senpy.Plugin that can be initialized without a configuration file. i.e. classes that contain all the required attributes for a plugin.

  • Plugin definition files (see Advanced plugin definition)

How can I define additional parameters for my plugin?

Your plugin may ask for additional parameters from users by using the attribute extra_params in your plugin definition. It takes a dictionary, where the keys are the name of the argument/parameter, and the value has the following fields:

  • aliases: the different names which can be used in the request to use the parameter.

  • required: if set to true, users need to provide this parameter unless a default is set.

  • options: the different acceptable values of the parameter (i.e. an enum). If set, the value provided must match one of the options.

  • default: the default value of the parameter, if none is provided in the request.

"extra_params":{
   "language": {
      "aliases": ["language", "lang", "l"],
      "required": True,
      "options": ["es", "en"],
      "default": "es"
      }
   }

How should I load external data and files

Most plugins will need access to files (dictionaries, lexicons, etc.). These files are usually heavy or under a license that does not allow redistribution. For this reason, senpy has a data_folder that is separated from the source files. The location of this folder is controlled programmatically or by setting the SENPY_DATA environment variable. You can use the self.path(filepath) function to get the path of a given filepath within the data folder.

Plugins have a convenience function self.open which will automatically look for the file if it exists, or open a new one if it doesn’t:

import os


class PluginWithResources(AnalysisPlugin):
    file_in_data = <FILE PATH>
    file_in_sources = <FILE PATH>

    def on activate(self):
        with self.open(self.file_in_data) as f:
            self._classifier = train_from_file(f)
        file_in_source = os.path.join(self.get_folder(), self.file_in_sources)
        with self.open(file_in_source) as f:
            pass

It is good practice to specify the paths of these files in the plugin configuration, so the same code can be reused with different resources.

Can I build a docker image for my plugin?

Add the following dockerfile to your project to generate a docker image with your plugin:

FROM gsiupm/senpy

Once you make sure your plugin works with a specific version of senpy, modify that file to make sure your build will work even if senpy gets updated. e.g.:

FROM gsiupm/senpy:1.0.1

This will copy your source folder to the image, and install all dependencies. Now, to build an image:

docker build . -t gsiupm/exampleplugin

And you can run it with:

docker run -p 5000:5000 gsiupm/exampleplugin

If the plugin uses non-source files (How should I load external data and files), the recommended way is to use SENPY_DATA folder. Data can then be mounted in the container or added to the image. The former is recommended for open source plugins with licensed resources, whereas the latter is the most convenient and can be used for private images.

Mounting data:

docker run -v $PWD/data:/data gsiupm/exampleplugin

Adding data to the image:

FROM gsiupm/senpy:1.0.1
COPY data /

What annotations can I use?

You can add almost any annotation to an entry. The most common use cases are covered in the API and vocabularies.

Why does the analyse function yield instead of return?

This is so that plugins may add new entries to the response or filter some of them. For instance, a chunker may split one entry into several. On the other hand, a conversion plugin may leave out those entries that do not contain relevant information.

If I’m using a classifier, where should I train it?

Training a classifier can be time time consuming. To avoid running the training unnecessarily, you can use ShelfMixin to store the classifier. For instance:

from senpy.plugins import ShelfMixin, AnalysisPlugin

class MyPlugin(ShelfMixin, AnalysisPlugin):
    def train(self):
        ''' Code to train the classifier
        '''
        # Here goes the code
        # ...
        return classifier

    def activate(self):
        if 'classifier' not in self.sh:
            classifier = self.train()
            self.sh['classifier'] = classifier
        self.classifier = self.sh['classifier']

    def deactivate(self):
        self.close()

By default the ShelfMixin creates a file based on the plugin name and stores it in that plugin’s folder. However, you can manually specify a ‘shelf_file’ in your .senpy file.

Shelves may get corrupted if the plugin exists unexpectedly. A corrupt shelf prevents the plugin from loading. If you do not care about the data in the shelf, you can force your plugin to remove the corrupted file and load anyway, set the ‘force_shelf’ to True in your plugin and start it again.

How can I turn an external service into a plugin?

This example ilustrate how to implement a plugin that accesses the Sentiment140 service.

class Sentiment140Plugin(SentimentPlugin):
    def analyse_entry(self, entry, params):
        text = entry.text
        lang = params.get("language", "auto")
        res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
                            json.dumps({"language": lang,
                                        "data": [{"text": text}]
                                        }
                                       )
                            )

        p = params.get("prefix", None)
        polarity_value = self.maxPolarityValue*int(res.json()["data"][0]
                                                   ["polarity"]) * 0.25
        polarity = "marl:Neutral"
        neutral_value = self.maxPolarityValue / 2.0
        if polarity_value > neutral_value:
            polarity = "marl:Positive"
        elif polarity_value < neutral_value:
            polarity = "marl:Negative"

        sentiment = Sentiment(id="Sentiment0",
                            prefix=p,
                            marl__hasPolarity=polarity,
                            marl__polarityValue=polarity_value)
        sentiment.prov(self)
        entry.sentiments.append(sentiment)
        yield entry

How can I activate a DEBUG mode for my plugin?

You can activate the DEBUG mode by the command-line tool using the option -d.

senpy -d

Additionally, with the --pdb option you will be dropped into a pdb post mortem shell if an exception is raised.

python -m pdb yourplugin.py

Where can I find more code examples?

See: http://github.com/gsi-upm/senpy-plugins-community.