Installation

Clone the repository to your system. This stance detection software requires Python 3.7 or newer. From inside the cloned directory, run the following to install the necessary dependencies:

```
pip install -r requirements.txt
```
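Because the software needs Python 3.7 or newer, it can help to confirm the interpreter version before installing. The following is a minimal sketch; the helper name `meets_minimum` is hypothetical and not part of the repository.

```python
import sys

# Minimum version stated in the installation notes above.
MINIMUM_VERSION = (3, 7)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if not meets_minimum():
    print("Python 3.7 or newer is required; found", sys.version.split()[0])
```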

Usage

Once the installation has finished, start a Python interactive session:

```
python
```

Once the interactive session is running, import the application module:

```
import detection_app
```

The import takes a little while to load. Once it finishes, you can run stance detection on data in several ways. Please note that the functions that process stances from a file will take quite a while on large input files.

The first way is the simplest and can be run on text of any size. It is meant for testing specific sentences or short texts to see what stances they produce. Run it as follows:

```
detection_app.test("This is a sentence to run", "covid")
```

The next three ways are more involved and run stance detection on files of data. Each of the functions described below writes an output file to a directory named "user_provided_stance_output", which the software creates if it does not already exist. The output files have a .jsonl extension because each line in the output file holds the stances for the corresponding line in the input file.
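Since the output is a .jsonl file, it can be read back one JSON value per line. The sketch below assumes nothing about the fields the software writes; the `read_jsonl` helper, the stand-in file, and its "stances" key are all hypothetical and exist only so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Parse a .jsonl file into a list, one JSON value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                records.append(json.loads(line))
    return records


# Stand-in file with made-up content; a real output file would be found
# under the "user_provided_stance_output" directory instead.
demo_path = Path(tempfile.mkdtemp()) / "demo.jsonl"
demo_path.write_text('{"stances": []}\n{"stances": ["example"]}\n', encoding="utf-8")

records = read_jsonl(demo_path)
print(len(records), "lines parsed")
```

Because output lines correspond to input lines, `records[i]` would line up with line `i` of the input file.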

Detecting stances from a text file
The .txt file provided must have a single text per line. For example, one line could read "I wear masks to protect me." (the quotation marks do not have to be present in the .txt file). The domain for which you want to detect stances is specified in the first parameter (e.g. "covid").
```
detection_app.text_to_stances("covid", "path/to/input/file", "data_description", 0)
```
- The third parameter is used to build the output file's name, so the user can provide a description that helps keep track of the output files and the data they were created from. If the data description is not specified, the output file will be called user_provided_text_stances.jsonl. NOTE: if you run this function more than once without specifying the description, it will always overwrite the previous user_provided_text_stances.jsonl with the most recent output.
- The last parameter is the number of lines in the input file that the user wants to process. If the user specifies 0 or does not provide the last parameter, the function defaults to processing the entire input file.
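The input format and the overwrite warning above can be handled with a little preparation code. This is a sketch under stated assumptions: `write_text_input` and `timestamped_description` are hypothetical helpers, not part of `detection_app`. The first writes one text per line, and the second builds a description that changes between runs so earlier output files are not overwritten.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def write_text_input(texts, path):
    """Write one text per line, collapsing embedded newlines and extra spaces."""
    lines = [" ".join(text.split()) for text in texts]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def timestamped_description(label, now=None):
    """Build a per-run description such as 'masks_20210301_120000'."""
    now = now or datetime.now()
    return f"{label}_{now:%Y%m%d_%H%M%S}"


input_path = Path(tempfile.mkdtemp()) / "sample_texts.txt"
line_count = write_text_input(
    ["I wear masks to protect me.", "Masks are\nuncomfortable."], input_path
)
description = timestamped_description("masks")

# With the repository's dependencies installed, the documented call would be:
# import detection_app
# detection_app.text_to_stances("covid", str(input_path), description, 0)
```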

Detecting stances from a JSON file
Provide a path to a JSON file for stance processing. The extension does not have to be .json, but each line must be a single JSON structure that has an attribute for the text to process, some kind of author identifier, a timestamp, and some kind of document identifier. For example, if the JSON represents a tweet, it should contain the author id, the timestamp of the tweet, and the id of the tweet. You must provide the name of each of these attributes as it appears in each JSON structure. The domain for which you want to detect stances is specified in the first parameter (e.g. "covid").
```
detection_app.json_to_stances("covid", "path/to/input/file", "text_attribute", "author_attribute", "timestamp_attribute", "doc_id_attribute", "data_description", 0)
```
- NOTE: If one of the attributes is nested (e.g. the author attribute is {"user": {"id": 12345678}}), supply the attribute names in the appropriate order, comma separated: "user,id".
- The data description is appended to the output file name as described above. NOTE: the same overwrite warning applies here.
- Again, the last parameter is the number of lines to process. Providing 0 or omitting the parameter processes the whole input file.
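To illustrate what a comma-separated attribute path such as "user,id" refers to, the sketch below resolves one against a sample record. The `resolve_attribute` helper and the sample field names ("created_at", "id", and so on) are hypothetical; this is not the repository's implementation, only a way to check a path against your own data before running the tool.

```python
import json
from functools import reduce


def resolve_attribute(record, attribute_path):
    """Follow a comma-separated path such as "user,id" through nested dicts."""
    return reduce(lambda node, key: node[key], attribute_path.split(","), record)


# One line of a hypothetical input file: a single JSON structure per line.
line = json.dumps(
    {
        "text": "I wear masks to protect me.",
        "user": {"id": 12345678},
        "created_at": "2021-03-01T12:00:00Z",
        "id": 987,
    }
)
record = json.loads(line)

print(resolve_attribute(record, "user,id"))
print(resolve_attribute(record, "text"))

# For a file of such lines, the documented call would then look like:
# detection_app.json_to_stances("covid", "path/to/input/file", "text",
#                               "user,id", "created_at", "id", "tweets", 0)
```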

Detecting stances from a CSV file
Provide a path to a CSV file for stance processing. The file must have a header row with a label for the text to process, some kind of author identifier, a timestamp, and some kind of document identifier. For example, if each CSV line represented a tweet, there would be headers for the text of the tweet, the author id, the timestamp of the tweet, and the id of the tweet. You must provide the name of each of these labels as it appears in the CSV file. The domain for which you want to detect stances is specified in the first parameter (e.g. "covid").
```
detection_app.csv_to_stances("covid", "path/to/input/file", "text_label", "author_label", "timestamp_label", "doc_id_label", "data_description", 0)
```
- NOTE: If one of the items requires combining two CSV columns (e.g. the timestamp is made up of a date column and a time column), supply both labels in the appropriate order, pipe (|) separated: "date|time".
- The data description is appended to the output file name as described above. NOTE: the same overwrite warning applies here.
- Again, the last parameter is the number of lines to process. Providing 0 or omitting the parameter processes the whole input file.
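The pipe-separated label form can be pictured with a small sketch. The `combine_columns` helper and the sample column names are hypothetical, and joining the values with a space is an assumption; the software may combine the columns differently.

```python
import csv
import io


def combine_columns(row, label_spec, joiner=" "):
    """Join the values of pipe-separated column labels such as "date|time"."""
    return joiner.join(row[label] for label in label_spec.split("|"))


# A hypothetical CSV with a header row and one tweet per line.
SAMPLE_CSV = (
    "text,author_id,date,time,tweet_id\n"
    '"I wear masks to protect me.",12345678,2021-03-01,12:00:00,987\n'
)
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
timestamp = combine_columns(rows[0], "date|time")
print(timestamp)

# For a file with these headers, the documented call would then look like:
# detection_app.csv_to_stances("covid", "path/to/input/file", "text",
#                              "author_id", "date|time", "tweet_id", "tweets", 0)
```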

ihmc/fine_grained_stance_detection
