ssp.snorkel


class ssp.snorkel.labelling_function.SSPLabelEvaluator(text_column='text', label_column='label', raw_tweet_table_name_prefix='raw_tweet_dataset', postgresql_host='localhost', postgresql_port='5432', postgresql_database='sparkstreamingdb', postgresql_user='sparkstreaming', postgresql_password='sparkstreaming')[source]

Bases: ssp.posgress.dataset_base.PostgresqlDatasetBase

run_labeler(version=0)[source]

class ssp.snorkel.labelling_function.SSPTweetLabeller(input_col='text', output_col='slabel')[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Snorkel transformer that uses labeling functions (LFs) to train a Label Model, which can then annotate text as AI-related or not AI-related.

Parameters
  • input_col – Name of the input text column if a DataFrame is used

  • output_col – Name of the output label column if a DataFrame is used

ABSTAIN = -1
NEGATIVE = 0
POSITIVE = 1
static bigram_check(x, word1, word2)[source]
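
The listing gives only the signature of bigram_check and the three label constants, not how they are used together. As a rough sketch, assuming bigram_check tests whether word1 is immediately followed by word2 in the text (an assumption, not confirmed by the source), a keyword-style rule could vote as shown here. The function bodies and the rules is_ai_text and mentions_big_data are hypothetical illustrations, not the library's implementation:

```python
# Label values as listed on the class.
ABSTAIN = -1
NEGATIVE = 0
POSITIVE = 1


def bigram_check(x, word1, word2):
    """Hypothetical stand-in: True if word1 is directly followed by word2 in x."""
    tokens = x.lower().split()
    # Compare each token with its right-hand neighbour.
    return any(a == word1 and b == word2 for a, b in zip(tokens, tokens[1:]))


def is_ai_text(x):
    """Hypothetical rule: vote POSITIVE on an AI-related bigram, otherwise abstain."""
    if bigram_check(x, "machine", "learning") or bigram_check(x, "deep", "learning"):
        return POSITIVE
    return ABSTAIN


def mentions_big_data(x):
    """Hypothetical rule: vote NEGATIVE when the text contains the bigram 'big data'."""
    return NEGATIVE if bigram_check(x, "big", "data") else ABSTAIN
```

A rule that cannot decide returns ABSTAIN rather than guessing, which lets a downstream model weigh only the rules that actually fired on a given text.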
evaluate(X, y)[source]
fit(X, y=None)[source]
Parameters
  • X – (DataFrame or list) Input text

  • y – None

Returns

NumPy array of shape [number of samples, number of LFs]
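
To make the [samples, LFs] shape concrete, here is a minimal sketch in plain Python. Nested lists stand in for the NumPy array, the two rules are hypothetical, and a simple majority vote is swapped in for the trained Label Model, so this illustrates the data layout only and is not what fit actually computes:

```python
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_mentions_ai(text):
    """Hypothetical rule: POSITIVE if the token 'ai' appears."""
    return POSITIVE if "ai" in text.lower().split() else ABSTAIN


def lf_mentions_weather(text):
    """Hypothetical rule: NEGATIVE if the text mentions weather."""
    return NEGATIVE if "weather" in text.lower() else ABSTAIN


def apply_lfs(texts, lfs):
    """Return one row per sample and one column per labeling function."""
    return [[lf(t) for lf in lfs] for t in texts]


def majority_vote(row):
    """Stand-in for the Label Model: the more frequent non-abstain vote wins.

    Ties and rows where every rule abstained yield ABSTAIN.
    """
    pos = row.count(POSITIVE)
    neg = row.count(NEGATIVE)
    if pos == neg:
        return ABSTAIN
    return POSITIVE if pos > neg else NEGATIVE


texts = ["AI is everywhere", "Nice weather today", "Hello world"]
matrix = apply_lfs(texts, [lf_mentions_ai, lf_mentions_weather])
labels = [majority_vote(row) for row in matrix]

print(matrix)  # [[1, -1], [-1, 0], [-1, -1]]
print(labels)  # [1, 0, -1]
```

Each row holds the votes for one text and each column the votes of one rule. A trained label model would additionally estimate how reliable each rule is instead of counting votes equally.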

is_ai_tweet = LabelingFunction is_ai_tweet, Preprocessors: []
is_not_ai_tweet = LabelingFunction is_not_ai_tweet, Preprocessors: []
normalize_prob(res)[source]
not_ai = LabelingFunction not_ai, Preprocessors: []
not_big_data = LabelingFunction not_big_data, Preprocessors: []
not_cv = LabelingFunction not_cv, Preprocessors: []
not_data_science = LabelingFunction not_data_science, Preprocessors: []
not_neural_network = LabelingFunction not_neural_network, Preprocessors: []
not_nlp = LabelingFunction not_nlp, Preprocessors: []
predict(X)[source]
transform(X, y=None)[source]