Abstract: The vast amount of publicly available digital text has made it increasingly difficult for security analysts to identify cyber-threat-related content on the Internet. In this research, we propose an autonomous system for extracting cyber threat information from publicly available information sources. We tested a neural embedding method, doc2vec, as the natural language filter for the proposed system. Using cybersecurity-specific training data and custom preprocessing, we trained a doc2vec model and evaluated its performance. In our evaluation, the natural language filter identified cybersecurity-specific natural language text with 83% accuracy.