This article explores the challenges and opportunities of deploying a large BERT Question Answering Transformer model (bert-large-uncased-whole-word-masking-finetuned-squad) from Hugging Face, where RedisGears and RedisAI do the heavy lifting while leveraging the in-memory datastore Redis.


Why do we need RedisAI?


Some numbers for inspiration and why to read this article:

python3 transformers_plain_bert_qa.py
airborne transmission of respiratory infections is the lack of established methods for the detection of airborne respiratory microorganisms
10.351818372 seconds

time curl -i -H "Content-Type: application/json" -X POST -d '{"search":"Who performs viral transmission among adults"}' http://localhost:8080/qasearch

real    0m0.747s
user    0m0.004s
sys     0m0.000s



BERT Question Answering inference works where the ML model selects an answer from the given text. In other words, BERT QA "thinks" through the following: "What is the answer from the text, assuming the answer to the question exists within the paragraph selected."
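
To make the "select an answer from the given text" idea concrete, here is a minimal sketch of span selection. The token list and the scores are made up for illustration; a real model produces one start score and one end score per token.

```python
import numpy as np

def select_answer(tokens, start_scores, end_scores):
    """Return the tokens between the highest-scoring start and end positions."""
    start = int(np.argmax(start_scores))
    end = int(np.argmax(end_scores)) + 1  # slice end is exclusive
    return tokens[start:end]

# Illustrative values only: question tokens, [SEP], then the paragraph tokens.
tokens = ["who", "spreads", "it", "[SEP]", "children", "spread", "the", "virus", "[SEP]"]
start_scores = np.array([0.1, 0.0, 0.2, 0.0, 3.5, 0.3, 0.1, 0.2, 0.0])
end_scores = np.array([0.0, 0.1, 0.0, 0.0, 2.9, 0.4, 0.2, 0.1, 0.0])

print(select_answer(tokens, start_scores, end_scores))
```

The model never generates new text; it only points at a start and an end position inside the paragraph it was given, which is why choosing the right paragraph matters.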


So it's important to select text potentially containing an answer. A typical pattern is to use Wikipedia data to build Open Domain Question Answering.


Our QA system is a medical domain-specific question/answering pipeline, hence we need a first pipeline that turns data into a knowledge graph. This NLP pipeline is available at Redis LaunchPad, is fully open source, and is described in a previous article. Here is a 5 minute video describing it, and below you will find an architectural overview:




BERT Question Answering pipeline and API


In the BERT QA pipeline (or in any other modern NLP inference task), there are two steps:

  1. Tokenize text - turn text into numbers
  2. Run the inference - large matrix multiplication

With Redis, we have the opportunity to pre-compute everything and store it in memory, but how do we do it? Unlike with the summarization task, the question is not known in advance, so we can't pre-compute all possible answers. However, we can pre-tokenize all potential answers (i.e. all paragraphs in the dataset) using RedisGears:

def parse_sentence(record):
    import redisAI
    import numpy as np
    global tokenizer
    if not tokenizer:
        tokenizer=loadTokeniser()
    hash_tag="{%s}" % hashtag()

    for idx, value in sorted(record['value'].items(), key=lambda item: int(item[0])):
        tokens = tokenizer.encode(value, add_special_tokens=False, max_length=511, truncation=True, return_tensors="np")
        tokens = np.append(tokens,tokenizer.sep_token_id).astype(np.int64)
        tensor=redisAI.createTensorFromBlob('INT64', tokens.shape, tokens.tobytes())

        key_prefix='sentence:'
        sentence_key=remove_prefix(record['key'],key_prefix)
        token_key = f"tokenized:bert:qa:{sentence_key}:{idx}"
        redisAI.setTensorInKey(token_key, tensor)
        execute('SADD',f'processed_docs_stage3_tokenized{hash_tag}', token_key)

See the full code on GitHub.


Then for each Redis Cluster shard, we pre-load the BERT QA model by downloading, exporting it into torchscript, then loading it into each shard:

def load_bert():
    model_file = 'traced_bert_qa.pt'

    with open(model_file, 'rb') as f:
        model = f.read()
    startup_nodes = [{"host": "", "port": "30001"}, {"host": "", "port":"30002"}, {"host":"", "port":"30003"}]
    cc = ClusterClient(startup_nodes = startup_nodes)
    hash_tags = cc.execute_command("RG.PYEXECUTE",  "gb = GB('ShardsIDReader').map(lambda x:hashtag()).run()")[0]
    print(hash_tags)
    for hash_tag in hash_tags:
        print("Loading model bert-qa{%s}" %hash_tag.decode('utf-8'))
        cc.modelset('bert-qa{%s}' %hash_tag.decode('utf-8'), 'TORCH', 'CPU', model)
        print(cc.infoget('bert-qa{%s}' %hash_tag.decode('utf-8')))

The full code is available on GitHub.


And when a question comes from the user, we tokenize and append the question to the list of potential answers before running the RedisAI model:

    token_key = f"tokenized:bert:qa:{sentence_key}"
    # encode question
    input_ids_question = tokenizer.encode(question, add_special_tokens=True, truncation=True, return_tensors="np")
    t=redisAI.getTensorFromKey(token_key)
    input_ids_context=to_np(t,np.int64)
    # merge (append) with potential answer, context - is pre-tokenized paragraph
    input_ids = np.append(input_ids_question,input_ids_context)
    attention_mask = np.array([[1]*len(input_ids)])
    input_idss=np.array([input_ids])
    num_seg_a=input_ids_question.shape[1]
    num_seg_b=input_ids_context.shape[0]
    token_type_ids = np.array([0]*num_seg_a + [1]*num_seg_b)
    # create actual model runner for RedisAI
    modelRunner = redisAI.createModelRunner(f'bert-qa{hash_tag}')
    # make sure all types are correct
    input_idss_ts=redisAI.createTensorFromBlob('INT64', input_idss.shape, input_idss.tobytes())
    attention_mask_ts=redisAI.createTensorFromBlob('INT64', attention_mask.shape, attention_mask.tobytes())
    token_type_ids_ts=redisAI.createTensorFromBlob('INT64', token_type_ids.shape, token_type_ids.tobytes())
    redisAI.modelRunnerAddInput(modelRunner, 'input_ids', input_idss_ts)
    redisAI.modelRunnerAddInput(modelRunner, 'attention_mask', attention_mask_ts)
    redisAI.modelRunnerAddInput(modelRunner, 'token_type_ids', token_type_ids_ts)
    redisAI.modelRunnerAddOutput(modelRunner, 'answer_start_scores')
    redisAI.modelRunnerAddOutput(modelRunner, 'answer_end_scores')
    # run RedisAI model runner
    res = await redisAI.modelRunnerRunAsync(modelRunner)
    answer_start_scores=to_np(res[0],np.float32)
    answer_end_scores = to_np(res[1],np.float32)
    answer_start = np.argmax(answer_start_scores)
    answer_end = np.argmax(answer_end_scores) + 1
    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end],skip_special_tokens = True))
    log("Answer "+str(answer))
    return answer

Check out the full code, available on GitHub.
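
The merge of the tokenized question with a pre-tokenized paragraph can be tried outside Redis. The following is a minimal NumPy-only sketch, not the project's code: build_inputs is a hypothetical helper, and the token ids are arbitrary apart from 101 and 102, the [CLS] and [SEP] ids of the uncased BERT vocabulary. Unlike the fragment above, it gives all three arrays the same (1, n) batch shape.

```python
import numpy as np

def build_inputs(question_ids, context_ids):
    """Concatenate question and context ids and derive the two companion arrays."""
    question_ids = np.asarray(question_ids, dtype=np.int64)
    context_ids = np.asarray(context_ids, dtype=np.int64)
    # Shape (1, n): a batch holding a single question/paragraph pair.
    input_ids = np.concatenate([question_ids, context_ids])[np.newaxis, :]
    # Every position is a real token, so the mask is all ones.
    attention_mask = np.ones_like(input_ids)
    # Segment 0 marks the question, segment 1 marks the paragraph.
    token_type_ids = np.concatenate([
        np.zeros(len(question_ids), dtype=np.int64),
        np.ones(len(context_ids), dtype=np.int64),
    ])[np.newaxis, :]
    return input_ids, attention_mask, token_type_ids

question = [101, 2040, 102]   # [CLS] <one question token> [SEP]
context = [2336, 3659, 102]   # <two paragraph tokens> [SEP], as stored by the pre-tokenizer
input_ids, attention_mask, token_type_ids = build_inputs(question, context)
print(input_ids.shape, token_type_ids.tolist())
```

Because the paragraph side is already stored as a tensor, only the short question needs tokenizing at request time.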


The process for making a BERT QA API call looks like this:




Here I use two cool features of RedisGears: capturing events on key miss, and using async/await to run RedisAI on each shard without locking the primary thread, so that Redis Cluster can continue to serve other customers. For benchmarks, caching responses from RedisAI is disabled. If you are getting response times in nanoseconds rather than milliseconds on the second call, check that the line linked above is commented out.
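
The key-miss pattern can be illustrated without Redis. This is a plain asyncio analogy under stated assumptions, not RedisGears code: the dictionary stands in for the keyspace, and run_model is a hypothetical placeholder for the awaited RedisAI call.

```python
import asyncio

cache = {}  # stand-in for the Redis keyspace

async def run_model(question):
    # Placeholder for the awaited inference; yields control like a real await would.
    await asyncio.sleep(0)
    return f"answer to {question}"

async def get_or_compute(key, question, cache_enabled=False):
    """Serve from the cache on a hit; on a miss, compute without blocking the loop."""
    if key in cache:
        return cache[key]
    answer = await run_model(question)
    if cache_enabled:  # disabled for benchmarks, so every call is a miss
        cache[key] = answer
    return answer

print(asyncio.run(get_or_compute("bertqa:demo", "Who performs viral transmission among adults")))
```

With cache_enabled left at False the model runs on every call, which is the behaviour the benchmark relies on.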


Running the Benchmark


Pre-requisites for running the benchmark:


Assuming you are running Debian or Ubuntu and have Docker and docker-compose installed (or can create a virtual environment via conda), run the following commands:

git clone --recurse-submodules https://github.com/applied-knowledge-systems/the-pattern.git
cd the-pattern
./bootstrap_benchmark.sh

The above commands should end with a curl call to the qasearch API, since Redis caching is disabled for the benchmark.


Next, invoke curl like this:

time curl -i -H "Content-Type: application/json" -X POST -d '{"search":"Who performs viral transmission among adults"}' http://localhost:8080/qasearch\n

Expect the following output, or something similar based on your runtime environment:

HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Sun, 29 May 2022 12:05:39 GMT
Content-Type: application/json
Content-Length: 2120
Connection: keep-alive

{"links":[{"created_at":"2002","rank":13,"source":"C0001486","target":"C0152083"}],"results":[{"answer":"adenovirus","sentence":"The medium of 40 T150 flasks of adenovirus transducer dec CAR CHO cells yielded 0 5 1 my of purified msCEACAM1a 1 4 protein","sentencekey":"sentence:PMC125375.xml:{mG}:202","title":"Crystal structure of murine sCEACAM1a[1,4]: a coronavirus receptor in the CEA family"}] OUTPUT_REDUCTED}

I modified the output of the API for the benchmark to return results from all shards - even if the answer is empty. In the run above five shards return answers. The overall API call response takes less than one second with all additional hops to search in RedisGraph!




Deep Dive into the Benchmark


Let's dig deeper into what's happening under the hood:


You should have a sentence key with a shard id, which you get by looking at the "Cache key" in the output of docker logs -f rgcluster. In my setup the cache key is "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults". If you think it looks like a function call, that's because it is one. It is triggered if the key isn't present in the Redis Cluster, which for the benchmark is every time, since we disabled caching of the output.


One more thing to figure out from the logs is the port of the shard corresponding to the hashtag, also known as the shard id. It is the text found between the curly brackets, such as {6fd} above. The same value appears in the output of the export_load script. In my case the cache key was found in "30012.log", so my port is 30012.
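
If you want to pull the hashtag out of a cache key programmatically, a small sketch like the one below works. extract_hashtag is a hypothetical helper; it follows the Redis Cluster rule that only the text between the first "{" and the first "}" after it counts, and only when that text is non-empty.

```python
def extract_hashtag(key):
    """Return the text between the first '{' and the next '}', or None if absent."""
    start = key.find("{")
    if start == -1:
        return None
    end = key.find("}", start + 1)
    if end == -1 or end == start + 1:  # no closing brace, or empty braces
        return None
    return key[start + 1:end]

cache_key = "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"
print(extract_hashtag(cache_key))
```

Keys that share a hashtag land on the same shard, which is why the model key bert-qa{6fd} and the cache key above can be served together.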


Next I run the following command:

redis-cli -c -p 30012 -h get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"

and then run the benchmark:

redis-benchmark -p 30012 -h -n 10 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"\n====== get bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults ======\n  10 requests completed in 0.04 seconds\n  50 parallel clients\n  3 bytes payload\n  keep alive: 1\n\n10.00% <= 41 milliseconds\n100.00% <= 41 milliseconds\n238.10 requests per second\n

If you are wondering, -n is the total number of requests. In this case we send 10 requests. You can also add:


--csv if you want to output in CSV format


--precision 3 if you want more decimals in the ms


More information about the benchmarking tool can be found on the redis.io Benchmarks page.


If you don't have redis-utils installed locally, you can use Docker as follows:

docker exec -it rgcluster /bin/bash\nredis-benchmark -p 30012 -h -n 10 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"\n====== get bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults ======\n  10 requests completed in 1.75 seconds\n  50 parallel clients\n  99 bytes payload\n  keep alive: 1\n  host configuration "save":\n  host configuration "appendonly": no\n  multi-thread: no\n\nLatency by percentile distribution:\n0.000% <= 243.711 milliseconds (cumulative count 1)\n50.000% <= 987.135 milliseconds (cumulative count 5)\n75.000% <= 1577.983 milliseconds (cumulative count 8)\n87.500% <= 1662.975 milliseconds (cumulative count 9)\n93.750% <= 1744.895 milliseconds (cumulative count 10)\n100.000% <= 1744.895 milliseconds (cumulative count 10)\n\nCumulative distribution of latencies:\n0.000% <= 0.103 milliseconds (cumulative count 0)\n10.000% <= 244.223 milliseconds (cumulative count 1)\n20.000% <= 409.343 milliseconds (cumulative count 2)\n30.000% <= 575.487 milliseconds (cumulative count 3)\n40.000% <= 821.247 milliseconds (cumulative count 4)\n50.000% <= 987.135 milliseconds (cumulative count 5)\n60.000% <= 1157.119 milliseconds (cumulative count 6)\n70.000% <= 1497.087 milliseconds (cumulative count 7)\n80.000% <= 1577.983 milliseconds (cumulative count 8)\n90.000% <= 1662.975 milliseconds (cumulative count 9)\n100.000% <= 1744.895 milliseconds (cumulative count 10)\n\nSummary:\n  throughput summary: 5.73 requests per second\n  latency summary (msec):\n          avg       min       p50       p95       p99       max\n     1067.296   243.584   987.135  1744.895  1744.895  1744.895\n

The platform only has 20 articles and 8 Redis nodes (4 masters + 4 slaves), so the relevance of the answers will be poor, but it also doesn't need a lot of memory.




Now let's check how long our RedisAI model runs on the {6fd} shard:

\n> AI.INFO bert-qa{6fd}\n 1) "key"\n 2) "bert-qa{6fd}"\n 3) "type"\n 4) "MODEL"\n 5) "backend"\n 6) "TORCH"\n 7) "device"\n 8) "CPU"\n 9) "tag"\n10) ""\n11) "duration"\n12) (integer) 8928136\n13) "samples"\n14) (integer) 58\n15) "calls"\n16) (integer) 58\n17) "errors"\n18) (integer) 0\n\n

bert-qa{6fd} is the key of the actual (very large) model saved. The AI.INFO command gives us a cumulative duration of 8928136 microseconds and 58 calls, which is approximately 154 milliseconds per call.
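
The per-call figure is just the cumulative duration divided by the number of calls. A short sketch of that arithmetic, assuming the reply has already been read into a flat Python list of alternating names and values (average_call_ms is a hypothetical helper):

```python
def average_call_ms(info_reply):
    """Average model run time in milliseconds from a flat AI.INFO-style reply."""
    info = dict(zip(info_reply[::2], info_reply[1::2]))  # pair names with values
    if info["calls"] == 0:
        return 0.0
    return info["duration"] / info["calls"] / 1000.0  # microseconds -> milliseconds

reply = ["key", "bert-qa{6fd}", "type", "MODEL", "backend", "TORCH",
         "device", "CPU", "tag", "", "duration", 8928136,
         "samples", 58, "calls", 58, "errors", 0]
print(round(average_call_ms(reply), 1))  # 153.9
```

The zero check matters right after a RESETSTAT, when both counters are 0.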


Let's double-check to make sure that's right by resetting the stats and then re-running the benchmark.


First, reset the stats:

\n> AI.INFO bert-qa{6fd} RESETSTAT\nOK\n127.0.0.1:30012> AI.INFO bert-qa{6fd}\n 1) "key"\n 2) "bert-qa{6fd}"\n 3) "type"\n 4) "MODEL"\n 5) "backend"\n 6) "TORCH"\n 7) "device"\n 8) "CPU"\n 9) "tag"\n10) ""\n11) "duration"\n12) (integer) 0\n13) "samples"\n14) (integer) 0\n15) "calls"\n16) (integer) 0\n17) "errors"\n18) (integer) 0\n

Then, re-run the benchmark:

redis-benchmark -p 30012 -h -n 10 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"\n====== get bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults ======\n  10 requests completed in 1.78 seconds\n  50 parallel clients\n  99 bytes payload\n  keep alive: 1\n  host configuration "save":\n  host configuration "appendonly": no\n  multi-thread: no\n\nLatency by percentile distribution:\n0.000% <= 188.927 milliseconds (cumulative count 1)\n50.000% <= 995.839 milliseconds (cumulative count 5)\n75.000% <= 1606.655 milliseconds (cumulative count 8)\n87.500% <= 1692.671 milliseconds (cumulative count 9)\n93.750% <= 1779.711 milliseconds (cumulative count 10)\n100.000% <= 1779.711 milliseconds (cumulative count 10)\n\nCumulative distribution of latencies:\n0.000% <= 0.103 milliseconds (cumulative count 0)\n10.000% <= 189.183 milliseconds (cumulative count 1)\n20.000% <= 392.191 milliseconds (cumulative count 2)\n30.000% <= 540.159 milliseconds (cumulative count 3)\n40.000% <= 896.511 milliseconds (cumulative count 4)\n50.000% <= 996.351 milliseconds (cumulative count 5)\n60.000% <= 1260.543 milliseconds (cumulative count 6)\n70.000% <= 1456.127 milliseconds (cumulative count 7)\n80.000% <= 1606.655 milliseconds (cumulative count 8)\n90.000% <= 1692.671 milliseconds (cumulative count 9)\n100.000% <= 1779.711 milliseconds (cumulative count 10)\n\nSummary:\n  throughput summary: 5.62 requests per second\n  latency summary (msec):\n          avg       min       p50       p95       p99       max\n     1080.454   188.800   995.839  1779.711  1779.711  1779.711\n

Now check the stats again:

AI.INFO bert-qa{6fd}\n 1) "key"\n 2) "bert-qa{6fd}"\n 3) "type"\n 4) "MODEL"\n 5) "backend"\n 6) "TORCH"\n 7) "device"\n 8) "CPU"\n 9) "tag"\n10) ""\n11) "duration"\n12) (integer) 1767749\n13) "samples"\n14) (integer) 20\n15) "calls"\n16) (integer) 20\n17) "errors"\n18) (integer) 0\n

Now we get 88387.45 microseconds per call (about 0.088 seconds), which is pretty fast! Also, considering we started with 10 seconds per call, I think the benefits of using RedisAI in combination with RedisGears are pretty obvious. However, the trade-off is high memory usage.


There are many ways to optimize this deployment. For example, you can add FP16 quantization and the ONNX runtime. If you would like to try that, this script will be a good starting point.


Using Grafana to monitor RedisGears throughput, CPU, and Memory usage


Thanks to the contribution of Mikhail Volkov, we can now observe RedisGears and RedisGraph throughput and memory consumption using Grafana. The repository you cloned also starts a Grafana Docker container, which has pre-built templates to monitor RedisCluster, including RedisGears and RedisAI, and Graph, which is Redis with RedisGraph. "The Pattern" dashboard provides an overview, with all the key benchmark metrics you care about:






This post is in collaboration with Redis.

Building a Pipeline for Natural Language Processing using RedisGears (https://reference-architecture.ai/docs/nlp/)



Disclaimer: originally published in collaboration with Ajeet Raina on Developer.Redis.Com


In this tutorial, you will learn how to build a pipeline for Natural Language Processing (NLP) using RedisGears. For this demonstration, we will be leveraging the Kaggle CORD19 datasets. The implementation is designed to avoid running out of memory by leveraging Redis Cluster and RedisGears: RedisGears processes data where it is stored, without moving it in and out of the Redis Cluster, using Redis Cluster as a data fabric. Redis Cluster allows for horizontal scalability up to 1000 nodes and, together with RedisGears, provides a distributed system where data science/ML engineers can focus on processing steps without worrying about writing tons of scaffolding for distributed calculations.




This project was built with the aim to make it easier for other people to contribute and build better information and knowledge management products.


Why do data scientists use RedisGears?


RedisGears has enormous potential, particularly for text processing: you can process your data “on data” without needing to move it in and out of memory. Summary of the important points:


What is a knowledge graph?


Today, we live in a world of new systems that operate not just on files, folders, or web pages, but on entities with their properties and the relationships between them, organized into hierarchies of classes and categories. These systems are used everywhere from the military-industrial complex to our everyday lives. Palantir, Primer, and other data companies enable massive intelligence and counterintelligence projects in military and security forces, Quid and RecordedFuture enable competitive analytics, and Bottlenose and similar enterprises enable online reputation analytics. Microsoft Graph enables new kinds of productivity apps for enterprises, Google Knowledge Graph and Microsoft’s Satori enable everyday search queries, and together with Amazon Information Graph they power the corresponding AI assistants by enabling them to answer questions about world facts.


All these (and many other more specialized) systems are used in different domains, but all of them use Knowledge Graphs as their foundation.


Knowledge graphs are one of the best ways to connect and make sense out of information from different data sources, following the motto of one of the vendors— “It’s about things not strings”.


A knowledge graph consists of a thesaurus, a taxonomy, and an ontology. In this pipeline I assume knowledge is captured in the UMLS medical metathesaurus and that concepts in text are related if they are part of the same sentence; therefore each concept becomes a node and each relationship becomes an edge:




Concepts have CUIs (Concept Unique Identifiers), which serve as primary keys for nodes and link to the UMLS thesaurus. For example, if you search “How does temperature and humidity affect the transmission of 2019-nCoV?” on the demo website http://thepattern.digital/ and move the slider to 1996, there is an edge connecting transmission (C5190195) and birth (C5195639), and the part of the sentence matched is “the rate of transmission to an infant born to” from the report titled “Afebrile Pneumonia in infants.”
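
The rule that concepts sharing a sentence become connected nodes can be sketched in a few lines. sentence_to_edges is a hypothetical helper, not the project's code, and the third CUI in the usage check is a made-up placeholder.

```python
from itertools import combinations

def sentence_to_edges(concepts):
    """Turn the CUIs found in one sentence into undirected edges between them."""
    unique = sorted(set(concepts))  # each concept is one node, listed once
    return list(combinations(unique, 2))

# The two CUIs from the example above, found in the same sentence.
print(sentence_to_edges(["C5190195", "C5195639"]))
```

A sentence with n distinct concepts yields n(n-1)/2 edges, so long sentences contribute many more edges than short ones.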




RedisGears for NLP pre-processing


Overall Architecture Overview (Components Diagram)




The intake step is very simple: put all JSON records into RedisCluster, and then the NLP pipeline starts processing all records. The code is here.


How do the NLP pipeline steps fit into RedisGears?

  1. For each record, detect the language (discard non-English) - it's a filter
  2. Map paragraphs into sentences - a flatmap
  3. Spellcheck sentences - it's a map
  4. Save sentences into a hash - a processor

Step 1. Pre-requisite


Ensure that virtualenv is installed on your system.


Step 2. Clone the repository

 git clone --recurse-submodules https://github.com/applied-knowledge-systems/the-pattern.git\n cd the-pattern\n

Step 3. Bring up the application

 docker-compose -f docker-compose.dev.yml up --build -d\n

Step 4. Apply cluster configuration settings


You can deploy PyTorch and spacy to run on RedisGears.

 bash post_start_dev.sh\n

For a data science-focused deployment, RedisCluster should be in HA mode with at least one slave for each master. You need to change a few default parameters for rgcluster to accommodate the size of the PyTorch and spacy libraries (each over 1GB zipped); see the gist with settings.


Step 5. Create or activate Python virtual environment

 cd ./the-pattern-platform/\n

Step 6. Create new environment


You can create it via

 conda create -n pattern_env python=3.8\n



Alternatively, you can activate an existing virtual environment from the command line:

 source ~/venv_cord19/bin/activate #or create new venv
 pip install -r requirements.txt

Step 7. Run pipeline

 bash cluster_pipeline.sh\n

Step 8. Validating the functionality of the NLP pipeline


Wait for a bit and then check:


Verifying Redis Graph populated:

 redis-cli -p 9001 -h GRAPH.QUERY cord19medical "MATCH (n:entity) RETURN count(n) as entity_count"
 redis-cli -p 9001 -h GRAPH.QUERY cord19medical "MATCH (e:entity)-[r]->(t:entity) RETURN count(r) as edge_count"

Checking API responds:

 curl -i -H "Content-Type: application/json" -X POST -d '{"search":"How does temperature and humidity affect the transmission of 2019-nCoV"}' http://localhost:8080/gsearch



While RedisGears allows you to deploy and run machine learning libraries like spacy and BERT transformers, the solution above uses a simpler approach:

 gb = GB('KeysReader')
 gb.filter(filter_language)
 gb.flatmap(parse_paragraphs)
 gb.map(spellcheck_sentences)
 gb.foreach(save_sentences)
 gb.count()
 gb.register('paragraphs:*',keyTypes=['string','hash'], mode="async_local")

This is the overall pipeline: those 7 lines allow you to run logic in a distributed cluster or on a single machine using all available CPUs, with no changes required until you need to scale beyond 1000 nodes. I use KeysReader registered for the paragraphs namespace for all strings or hashes. My pipeline needs to run in async mode. For data scientists, I would recommend first using gb.run, which runs in batch mode, to make sure the Gears function works, and then changing it to register to capture new data. By default, functions return their output, hence the need for count(): it prevents fetching the whole dataset back to the machine issuing the command (90 GB for Cord19).
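
To see what each stage does to the data, here is a plain-Python analogy of the same filter, flatmap, map, foreach, count chain. It does not use RedisGears; the function bodies are trivial stand-ins that only borrow the names from the pipeline above, and the records are invented.

```python
def filter_language(record):
    return record["lang"] == "en"  # filter: keep or drop a whole record

def parse_paragraphs(record):
    # flatmap: one record becomes many sentences
    return [s.strip() for s in record["text"].split(".") if s.strip()]

def spellcheck_sentences(sentence):
    return sentence.replace("teh", "the")  # map: one sentence in, one sentence out

def run_pipeline(records):
    saved = []
    kept = (r for r in records if filter_language(r))
    sentences = (s for r in kept for s in parse_paragraphs(r))
    corrected = (spellcheck_sentences(s) for s in sentences)
    for sentence in corrected:  # foreach: side effect only, here a local save
        saved.append(sentence)
    return len(saved), saved  # count: report a number instead of the data

records = [
    {"lang": "en", "text": "Bats carry teh virus. Humans are hosts."},
    {"lang": "fr", "text": "Bonjour."},
]
count, saved = run_pipeline(records)
print(count, saved)
```

Returning only the count mirrors why the real pipeline ends with count(): the caller learns how much was processed without pulling the sentences back.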


Overall, pre-processing is straightforward; the full code is here.


Things to keep in mind:

  1. A node process can only save locally - we don't move data, so anything you want to save should have a hashtag. For example, to add to the set of processed_docs:
 execute('SADD','processed_docs_{%s}' % hashtag(),article_id)
  2. Loading external libraries into the computational thread: for example, symspell requires additional dictionaries and needs two steps to load:
"""
load symspell and relevant dictionaries
"""
sym_spell=None

def load_symspell():
    import pkg_resources
    from symspellpy import SymSpell, Verbosity
    sym_spell = SymSpell(max_dictionary_edit_distance=1, prefix_length=7)
    dictionary_path = pkg_resources.resource_filename(
        "symspellpy", "frequency_dictionary_en_82_765.txt")
    bigram_path = pkg_resources.resource_filename(
        "symspellpy", "frequency_bigramdictionary_en_243_342.txt")
    # term_index is the column of the term and count_index is the
    # column of the term frequency
    sym_spell.load_dictionary(dictionary_path, term_index=0, count_index=1)
    sym_spell.load_bigram_dictionary(bigram_path, term_index=0, count_index=2)
    return sym_spell
  3. Scispacy is a great library and data science tool, but after a few iterations of deploying it I ended up reading the data model documentation for the UMLS Metathesaurus and decided to build an Aho-Corasick automaton directly from UMLS data (MRXW_ENG.RRF contains all English term forms mapped to CUIs). Aho-Corasick allowed me to match incoming sentences into pairs of nodes (concepts from the medical dictionary) and present sentences as edges in a graph. The Gears-related code is simple:
 bg = GearsBuilder('KeysReader')
 bg.foreach(process_item)
 bg.count()
 bg.register('sentence:*',  mode="async_local",onRegistered=OnRegisteredAutomata)

OnRegisteredAutomata works similarly to the symspell example above, except that it downloads a pre-built Aho-Corasick automaton (30Mb). Aho-Corasick is a very fast matcher and can process >900 Mb of text per second even on a commodity laptop. The RedisGears cluster smoothly distributes the data and the ML model and performs matching using available CPU and memory. Full matcher code.
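
For readers unfamiliar with Aho-Corasick, below is a compact pure-Python sketch of the algorithm. It is not the project's matcher (which loads a pre-built automaton): build_automaton and find_matches are hypothetical names, and the two-term dictionary is a toy that reuses the CUIs mentioned earlier.

```python
from collections import deque

def build_automaton(terms):
    """Build trie transitions, failure links and outputs for a {term: cui} dict."""
    goto, fail, output = [{}], [0], [[]]
    for term, cui in terms.items():
        state = 0
        for ch in term:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append((term, cui))
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target (shorter suffixes).
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output

def find_matches(text, automaton):
    """Scan the text once and return (start_index, term, cui) for every match."""
    goto, fail, output = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term, cui in output[state]:
            matches.append((i - len(term) + 1, term, cui))
    return matches

automaton = build_automaton({"transmission": "C5190195", "birth": "C5195639"})
print(find_matches("transmission at birth", automaton))
```

The text is scanned once regardless of how many terms the dictionary holds, which is what makes the approach attractive for a large vocabulary such as UMLS.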


The output of the matcher, nodes and edges, is a candidate for another RedisGears pattern, rgsync, where you write fast into Redis and RedisGears replicates the data into slower storage using RedisStreams. However, I decided to use streams and handcraft the population of the RedisGraph database, which will be the focus of the next blog post.


Call to action


We took OCR scans in JSON format and turned them into a Knowledge Graph, demonstrating how you can use the traditional Semantic Network/OWL/Metathesaurus technique based on the Unified Medical Language System. The Redis ecosystem offers a lot to the data science community, and can take its place at the core of Kaggle notebooks and ML frameworks and make deployment and distribution of data more enjoyable. The success of our industry depends on how our tools work together, regardless of whether they are engineering, data science, machine learning, organisational, or architectural.


With the collaboration of RedisLabs and the community, the full pipeline code is available via https://github.com/applied-knowledge-systems/the-pattern-platform. If you want to try it locally, you can find a Docker launch script in the root of the repository along with a short quickstart guide. PRs and suggestions are welcome. The overall goal of the project is to allow others to build their own, more interesting pipelines on top of it.



Metadata Management (https://reference-architecture.ai/docs/metadata/)

In the CORD-19 dataset mentioned in Data Acquisition, metadata is stored in a separate CSV file from the source data. Here is a simple script to parse dates/times and attach them to the JSON/XML files.

Contribution Guidelines (https://reference-architecture.ai/docs/contribution/)

General guidelines for contributing to the project.


Main goals


Be data-driven


Engineering approach


There should be a path to implementation in the real world - a good prototype or production deployment.


How to contribute


When contributing, you agree to share your contribution under





The Pattern: Machine Learning Natural Language Processing meets VR/AR (https://reference-architecture.ai/docs/ai-product/)

To fight ever-increasing complexity, "The Pattern" projects help find relevant knowledge using Artificial Intelligence and novel UX elements, all powered by Redis - a new generation real-time data fabric turned into a knowledge fabric.


Overall repository for CORD19 medical NLP pipeline, API and UI, design and architecture.


Demo Video:



Demo Server (no persistence): https://thepattern.digital/


The challenge


The medical profession has put a lot of effort into collaboration, from Latin as a common language to industry-wide thesauruses like UMLS. However, it is full of scandals in which a publication in a prestigious journal is retracted and the World Health Organisation changes its policy advice based on the article. I think "paper claiming that eating a bat-like Pokémon sparked the spread of COVID-19" takes the prize. One would say that editors in those journals don't do their job, and while it may seem true, I would say they had no chance: with the number of publications about COVID (SARS-V) passing 300+ per day, we need better tools to navigate such a flow of information.

When exploring science or engineering topics, I look at the diversity of opinion, not the variety of the same cluster of words or the same thought. I want to avoid confirmation bias. I want to find articles relevant to the same concept, not necessarily the ones which have similar words. My focus is to build a natural language processing pipeline capable of handling a large number of documents and concepts, incorporating System 1 AI (fast, intuitive reasoning) and System 2 (high-level reasoning), and then present knowledge in a modern VR/AR visualisation. Search, or rather information exploration, should be spatial, preferably in VR (memory palace, see Theatre of Giulio Camillo). A force-directed graph is a path towards it, where visuals are assisted by text: relevant text pops up on the connection, and people explore the concepts and then dig deeper into the text. The purpose of the pipeline is that knowledge should be reusable and shareable.




Join our community on Discord or post on GitHub Discussions (https://github.com/applied-knowledge-systems/the-pattern/discussions).

Support project by contributing (https://reference-architecture.ai/docs/donate/)

For Individuals


This is the beginning of an exciting, incredible new journey; support open source projects by donating or contributing.


For Organisations


Become a sponsor and promote the Reference Architecture for AI.


The ask is



Data Acquisition (https://reference-architecture.ai/docs/intake/)

For the Reference Architecture for AI, we used the Kaggle CORD-19 dataset: "COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 1,000,000 scholarly articles, including over 400,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease."


Ingest documents


The example script parses documents, takes out body_text, and saves it under paragraphs in the Redis cluster.
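
As a rough illustration of that parsing step, the sketch below turns one parsed article into a key and a field mapping that could be written as a Redis hash. article_to_hash and the exact key layout are assumptions made for this example; only paper_id, body_text, and the paragraphs prefix come from the dataset and the text above.

```python
def article_to_hash(article):
    """Map one parsed CORD-19 JSON article to (key, {paragraph index: text})."""
    key = f"paragraphs:{article['paper_id']}"  # assumed key layout
    fields = {
        str(idx): block["text"]
        for idx, block in enumerate(article.get("body_text", []))
    }
    return key, fields

# A made-up article with the two fields the sketch relies on.
article = {
    "paper_id": "PMC0000000",
    "body_text": [{"text": "First paragraph."}, {"text": "Second paragraph."}],
}
print(article_to_hash(article))
```

Numbering the fields keeps paragraph order recoverable later, since hash fields themselves are unordered.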

Turning Open Source project into Product with Redis Enterprise (https://reference-architecture.ai/posts/github-oauth2/)








Last year, my reference project, "The Pattern", was the hackathon winner 2021 and got a bit of publicity and, in total, seven forks. But as with many open source projects, it is now stale. Time to revive "The Pattern" with new features and GitHub sponsors or Patreon patrons to help and inspire developers and creatives. In return, it's common to provide sponsor-only features and articles. Nevertheless, how can we do it with a large Redis-based machine learning pipeline?


Plan sponsor only features


This article introduces a simple first step: for GitHub sponsors, we start by offering persistent storage of preferences. I have a simple Flask POST API which adds nodes to the user's preference storage, a simple Redis set per user. It will be a foundation on which to build other sponsor-only features. For now, let's cover the basics:


Overall architecture overview

flowchart LR
 id1(User) --> flask_login(Flask Login API)--> github(GitHub OAuth2)
 github-->flask_callback(Flask API callback)-->GitHubGraphQL(GitHub GraphQL)

Add Github oauth2 to Rest API


There are a number of APIs that GitHub offers to help developers, but the GitHub Authentication API is one of the most popular. This API allows you to log in to GitHub using your username and password, or an OAuth token.


A login button with a standard OIDC/OAuth2 dance is one of the most common ways for a user to authenticate to an API. The code below is taken from this gist and is typical of OAuth2 flows:

import os
from uuid import uuid4

from flask import Flask, redirect
from furl import furl

app = Flask(__name__)
client_id = os.getenv('GITHUB_CLIENT_ID')
client_secret = os.getenv('GITHUB_SECRET')

@app.route('/', methods=['GET', 'POST'])
def index():
    # Send the user to GitHub's authorization page
    url = 'https://github.com/login/oauth/authorize'
    params = {
        'client_id': client_id,
        # user:email is GitHub's scope for reading email addresses
        'scope': 'read:user,user:email',
        'state': str(uuid4().hex),
        'allow_signup': 'true'
    }
    url = furl(url).set(params)
    return redirect(str(url), 302)

where GITHUB_CLIENT_ID and GITHUB_SECRET are the client ID and client secret of a GitHub OAuth2 app. Register an OAuth app on GitHub for the following process:

import requests
from flask import jsonify, request

# app, uuid4, client_id and client_secret come from the snippet above;
# redis_client is assumed to be a redis.Redis connection to the local Redis OSS.
org_name = "applied-knowledge-systems"

@app.route('/oauth2/callback')
def oauth2_callback():
    # Exchange the temporary code for an access token
    code = request.args.get('code')
    access_token_url = 'https://github.com/login/oauth/access_token'
    payload = {
        'client_id': client_id,
        'client_secret': client_secret,
        'code': code,
        # 'redirect_uri':
        'state': str(uuid4().hex)
    }
    r = requests.post(access_token_url, json=payload, headers={'Accept': 'application/json'})
    # The token is a secret, so it is not printed or logged
    access_token = r.json().get('access_token')
    if not access_token:
        return jsonify({'status': 'error', 'message': 'no access token returned'}), 401

    access_user_url = 'https://api.github.com/user'
    response = requests.get(access_user_url, headers={'Authorization': 'token ' + access_token})
    data = response.json()
    # The email is null when the user keeps it private; Redis hashes cannot store None
    user_email = data.get("email") or ""
    user_login = data["login"]
    user_id = data["id"]
    # response=redirect(url_for('login',next=redirect_url()))
    # response.set_cookie('user_id', str(user_id))
    # response.set_cookie('user_login', str(user_login))
    # return response
    query = """
        {
        viewer {
            sponsorshipsAsSponsor(first: 100) {
            nodes {
                sponsorable {
                ... on User {
                    id
                    email
                    url
                }
                ... on Organization {
                    id
                    email
                    name
                    url
                }
                }
                tier {
                id
                name
                monthlyPriceInDollars
                monthlyPriceInCents
                }
            }
            }
        }
        }
        """
    response_graphql = requests.post('https://api.github.com/graphql', json={'query': query}, headers={'Authorization': 'token ' + access_token})
    response_graphql_data = response_graphql.json()["data"]
    nodes = response_graphql_data["viewer"]["sponsorshipsAsSponsor"]["nodes"]
    # Check every sponsorship, not only the first one; "name" is requested only
    # for organizations, so .get() avoids a KeyError on user sponsorables
    if any(node["sponsorable"].get("name") == org_name for node in (nodes or [])):
        # if user is a sponsor of Applied Knowledge System add them to set of sponsors
        redis_client.sadd(f'sponsors:{org_name}', user_id)
    # if RedisJSON enabled:
    # redis_client.json().set(f"user_details:{user_id}", '$', {
    #     'email': user_email,
    #     'id': user_id,
    #     'user_login': user_login,
    #     'graphql': response_graphql_data,
    # })
    # if not
    redis_client.hset(f"user_details:{user_id}", mapping={
        'email': user_email,
        'id': user_id,
        'user_login': user_login
    })
    return jsonify({
        'status': 'success',
        'email': user_email,
        'id': user_id,
        'user_login': user_login
    })
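One thing the callback above leaves out is checking the state parameter. In the OAuth2 flow, state is meant to be generated once before the redirect, remembered, and compared with the value GitHub sends back to the callback, which guards against cross-site request forgery. A minimal sketch, assuming the value is kept in a dict-like session (for example Flask's session); the function names and the oauth_state key are hypothetical:

```python
import hmac
import secrets


def new_oauth_state(session):
    """Create a random state value and remember it for the callback."""
    state = secrets.token_hex(16)
    session["oauth_state"] = state
    return state


def is_valid_oauth_state(session, returned_state):
    """Compare the returned state with the stored one; each state is single-use."""
    expected = session.pop("oauth_state", None)
    if not expected or not returned_state:
        return False
    # Constant-time comparison on bytes, so non-ASCII input cannot raise
    return hmac.compare_digest(expected.encode(), returned_state.encode())
```

In index() the value returned by new_oauth_state(session) would go into the state parameter, and the callback would reject the request unless is_valid_oauth_state(session, request.args.get('state')) holds.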

The API we are using for our sponsor-only feature is straightforward:

@app.route('/exclude', methods=['POST', 'GET'])
def mark_node():
    # The node id arrives in the JSON body (POST) or in the query string (GET)
    if request.method == 'POST':
        node_id = (request.get_json(silent=True) or {}).get('id')
    else:
        node_id = request.args.get('id')
    user_id = session.get('user_id')
    log(f"Got user {user_id} from session")
    if not user_id:
        user_id = request.cookies.get('user_id')
        log(f"Got user {user_id} from cookie")
    # Without both values there is nothing meaningful to store
    if not node_id or not user_id:
        return jsonify(message="Both a node id and a user are required"), 400
    redis_client.sadd("user:%s:mnodes" % user_id, node_id)
    response = jsonify(message=f"Finished {node_id} and {user_id}")
    return response

The only purpose of this API is to mark nodes as unimportant for the given user by adding them to a Redis set; those nodes are then excluded from the search API output. So far, everything has been pretty standard: a basic Flask API and the GitHub social login flow. Now let's add Redis Enterprise and synchronise sponsors' preferences.
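To make the exclusion step concrete, here is a minimal sketch of how a search endpoint could drop the marked nodes. The function name, the shape of the search results (dicts with an id field), and the client argument are assumptions for illustration; only the key pattern user:<id>:mnodes comes from the endpoint above. The client is assumed to return strings, as a redis-py client created with decode_responses=True does.

```python
def exclude_marked_nodes(client, user_id, results):
    """Drop search results whose id is in the user's set of marked nodes."""
    excluded = client.smembers("user:%s:mnodes" % user_id)
    return [node for node in results if str(node["id"]) not in excluded]
```

Because the marked nodes live in a set, the membership check stays cheap even when a user has excluded many nodes.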


Add Redis Enterprise


Why not use Redis Enterprise directly for everything? The project is memory-heavy, with a lot of data and machine learning inside Redis. This makes it possible to achieve state-of-the-art performance, but it also takes over 120 GB of RAM (or as much RAM as you can give it), and a 128 GB Redis Enterprise instance would exceed my budget for an open-source project. Obviously, if there are enough sponsors we can move more functionality into Redis Enterprise, but first we need to finish the basic building blocks. Register on Redis.com cloud and create a database with the subscription.


Take note of the host, port and password for Redis Enterprise and create a Docker environment file:

cat .env.gears
REDISENT_PWD="123"
REDISENT_PORT="13444"
REDISENT_HOST="hostname.cloud.redislabs.com"

and create a Docker Compose file with a section that passes .env.gears to the container. Mine looks like this:

  redisgraph:
    image: redislabs/redismod
    container_name: redisgears
    hostname: redisgears
    env_file:
      - ./.env.gears
    ports:
      -

Synchronize Redis OSS to Redis Enterprise using RedisGears


Synchronize all user preferences


First flow: we will use RedisGears to synchronize all preferences with Redis Enterprise.

flowchart LR
    id1(User Preferences Redis OSS) --> redis_gears1(Redis Gears) --> redise(Redis Enterprise)

If you are new to RedisGears, there is a pattern, rgsync, that covers exactly this use case, but I already have RedisGears, so I am going to build it step by step:

# gears_sync_preferences.py
rconn = None

def connecttoRedisEnterise():
    import redis
    import os
    # Get environment variables (passed to the container through .env.gears).
    # os.environ is not logged here because it contains the password.
    HOST = os.getenv('REDISENT_HOST')
    PASSWORD = os.getenv('REDISENT_PWD')
    PORT = os.getenv('REDISENT_PORT')

    redis_client = redis.Redis(host=HOST, port=PORT, encoding="utf-8", password=PASSWORD, decode_responses=True)
    return redis_client

def sync_users(record):
    global rconn
    if not rconn:
        rconn = connecttoRedisEnterise()
    # Uncomment logs to check
    # log(str(record['key']))
    # log(str(record['value']))

    # user_details:* keys are hashes, so record['value'] is a dict of fields
    rconn.hset(record['key'], mapping=record['value'])

gb = GB()
gb.foreach(sync_users)
gb.count()
gb.run('user_details:*')

This is the "batch" mode of RedisGears, which is easier to debug than streams. Install [gears-cli](https://github.com/RedisGears/gears-cli) with pip install gears-cli and run the script above:

gears-cli run --host --port 9001 gears_sync_preferences.py --requirements req_sync.txt

where req_sync.txt lists the Python packages the script needs to have installed (the script above imports redis).


This RedisGears script copies all user profiles into Redis Enterprise. Now let us add sponsors:

# gears_sync_sponsors.py
rconn = None

def remove_prefix(text, prefix):
    return text[text.startswith(prefix) and len(prefix):]

def connecttoRedisEnterise():
    import redis
    import os
    # Get environment variables
    HOST = os.getenv('REDISENT_HOST')
    PASSWORD = os.getenv('REDISENT_PWD')
    PORT = os.getenv('REDISENT_PORT')
    # Log only the endpoint; the password should never reach the Redis log
    log(f"{HOST}:{PORT}")
    redis_client = redis.Redis(host=HOST, port=PORT, encoding="utf-8", password=PASSWORD)
    return redis_client

def sync_sponsors(record):
    global rconn
    if not rconn:
        rconn = connecttoRedisEnterise()

    log(str(record['key']))
    # Read the set members locally and replay them into Redis Enterprise
    values = execute('SMEMBERS', record['key'])
    log(str(values))
    for each_value in values:
        rconn.sadd(record['key'], each_value)

gb = GB('KeysReader')
gb.foreach(sync_sponsors)
gb.count()
gb.run('user:*')

However, this script syncs the preferences of all users, while we only need those of sponsors. Let us add another feature of RedisGears, filter:

rconn = None

def remove_prefix(text, prefix):
    return text[text.startswith(prefix) and len(prefix):]

def connecttoRedisEnterise():
    import redis
    import os
    # Get environment variables
    HOST = os.getenv('REDISENT_HOST')
    PASSWORD = os.getenv('REDISENT_PWD')
    PORT = os.getenv('REDISENT_PORT')

    redis_client = redis.Redis(host=HOST, port=PORT, encoding="utf-8", password=PASSWORD, decode_responses=True)
    return redis_client

def filter_sponsors(record):
    org_name = "applied-knowledge-systems"
    # Keys look like user:<id>:mnodes, so keep only the <id> part
    user_id = remove_prefix(record['key'], 'user:').split(':')[0]
    # SISMEMBER returns 1 when the user is in the sponsors set, 0 otherwise
    sponsor = execute('SISMEMBER', f'sponsors:{org_name}', user_id)
    return sponsor == 1

def sync_sponsors(record):
    global rconn
    if not rconn:
        rconn = connecttoRedisEnterise()

    log(str(record['key']))
    values = execute('SMEMBERS', record['key'])
    log(str(values))
    for each_value in values:
        rconn.sadd(record['key'], each_value)

gb = GB()
# Only records that pass the filter reach foreach
gb.filter(filter_sponsors)
gb.foreach(sync_sponsors)
gb.count()
gb.run('user:*')
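One detail worth checking in the filter is how the user id is recovered from the key. The /exclude endpoint writes keys of the form user:<id>:mnodes, so removing only the user: prefix leaves <id>:mnodes, which would never match a member of the sponsors set. A small helper like the following (the name user_id_from_key is made up for this sketch) isolates the id and can be tested outside RedisGears:

```python
def user_id_from_key(key):
    """Extract '<id>' from a preference key shaped like 'user:<id>:mnodes'."""
    parts = key.split(":")
    if len(parts) < 2 or parts[0] != "user" or not parts[1]:
        raise ValueError(f"not a user preference key: {key!r}")
    return parts[1]
```

Raising on unexpected keys, rather than returning an empty string, makes a wrong key pattern show up in the RedisGears log instead of silently filtering everything out.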

Fetch sponsor's preferences back to Redis OSS from Redis Enterprise


Then we are going to use key miss events in RedisGears to fetch data for all users:

flowchart LR
    redise(Redis Enterprise)
    redis_gears2(Redis Gears)--key miss--->redise
    redis_gears2-->redisOSS[Redis OSS]

and it's very easy, taken almost directly from the key miss example:

rconn = None

def fetch_data(record):
    global rconn
    if not rconn:
        # connecttoRedisEnterise() is the helper defined in the sync scripts above
        rconn = connecttoRedisEnterise()
    key = record['key']
    # Copy the missing set from Redis Enterprise into the local Redis OSS
    values = rconn.smembers(key)
    log(str(values))
    for each_value in values:
        execute('SADD', key, each_value)

GB().foreach(fetch_data).register(prefix='user:*', commands=['smembers'], eventTypes=['keymiss'], mode="async_local")
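The key miss flow is a read-through cache: a read that misses locally triggers a fetch from the remote store, and the result is copied into the local one. The sketch below mimics that logic with plain Python dictionaries standing in for Redis OSS and Redis Enterprise, so it can be run without either; the class and method names are invented for illustration and are not part of RedisGears.

```python
class ReadThroughSets:
    """Local store of sets that falls back to a remote store on a miss."""

    def __init__(self, remote):
        self.local = {}       # stands in for Redis OSS
        self.remote = remote  # stands in for Redis Enterprise
        self.misses = 0

    def smembers(self, key):
        if key not in self.local:
            # Key miss: fetch from the remote store and populate the local one
            self.misses += 1
            values = self.remote.get(key, set())
            if values:
                self.local[key] = set(values)
            return set(values)
        return set(self.local[key])
```

Only the first read of a key pays the cost of the remote call; later reads are served from the local copy.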

There is one more option: turning fetch_data into an async call by wrapping it in async/await. But Redis Enterprise is fairly fast, and I don't think it's worth adding an async call in this case. If you are curious, see the example code in The Pattern repository.




In this article, we walked through the steps to create sponsor-specific "nanoservices" using Redis OSS, RedisGears and Redis Enterprise. This lets us leverage the best of all worlds: open source Redis, high availability and persistence with Redis Enterprise, and RedisGears as the glue that holds everything together.


This post is in collaboration with Redis.



\n\n" },{"title": "Announcing Reference Architecture for AI", "url": "https://reference-architecture.ai/posts/post-0/", "body": "

There are tools for advanced analytics, including free ones from Google and Kaggle.


There are well-known and validated deployment architectures for applications and the cloud.


Yet the number of practical applications is still tiny, and they remain niche implementations. While the benefits of AI are clear, there are still many gaps in AI architecture that need to be filled. For example, there is a gap between analytical tools and verified architectures for real-time deployments. This gap often stems from a lack of specific reference architectures and patterns that demonstrate the trade-offs between technologies, libraries, and tools.


Let's bridge the gap in knowledge and drive a connection between science and engineering to make fast, efficient, and practical AI deployments. Three things need to be in place to build an AI product:

  1. AI Product itself
  2. Core Capabilities required to build AI/ML product
  3. Enabling capabilities

I will use The Pattern, my [“Build on Redis” Hackathon prize-winning open source](https://github.com/applied-knowledge-systems/the-pattern) project, to illustrate how the capabilities below can be implemented and invite you to contribute or donate.


We launch with two full-featured articles: an NLP ML pipeline for turning unstructured JSON text into a knowledge graph and, fresh off the press, Benchmarks for BERT Large Question Answering inference for RedisAI and RedisGears with Grafana Dashboards by Mikhail Volkov.

\n" }]