Journal articles on the topic 'One-hot Encoder'

1

SHELUHIN, OLEG I., ANNA V. VANYUSHINA, and MAKSIM S. ZHELNOV. "USE OF LATENT-SEMANTIC ANALYSIS IN PREPARATION OF DATA FOR IDENTIFICATION OF ANONYMOUS USERS BY DIGITAL FINGERPRINTS." H&ES Research 14, no. 1 (2022): 36–44. http://dx.doi.org/10.36724/2409-5419-2022-14-1-36-44.

Abstract:
Changes in digital fingerprints over time, caused by updates to the system, plugins, browsers, installed programs, and fonts, are a serious problem for methods that track and identify users through the browser (web browser fingerprinting, FP). The set of parsed attributes can contain both metric and categorical (mostly non-numeric) values, for example, parameters such as user-agent, webgl, canvas, etc. These values therefore need to be pre-encoded for the convenience of further processing. For these purposes, artificial intelligence technologies, including natural language processing (NLP), are widely used. The aim of the research is to analyze the peculiarities of implementing latent semantic analysis (LSA) in the preparation and analysis of FP data for the identification of anonymous users. Methods. A comparative analysis is carried out of the common ways of converting categorical values of fingerprint attributes into numeric form (One-Hot-Encoding, Label-Encoder, and LSA) for identifying anonymous users with a predetermined number of possible values of categorical features. Results. The advantage of the LSA algorithm over One-Hot-Encoding and Label-Encoder is shown. The possibility of implementing clustering within the user identification problem, by visualizing FPs relative to hidden semantic topics using the LSA model, is shown. It is shown that, with a small number of hidden topics, using the obtained vectors of objects and vectors of terms to assess the similarity of two FPs, the proposed model allows the input FP to be confidently assigned to a common topic. The same vectors make it possible to apply various measures of cluster proximity: Euclidean distance, cosine measure, etc.
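To make the two baseline encodings named in this abstract concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not the authors' code: the attribute values and the helper names `label_encode` and `one_hot_encode` are hypothetical, and LSA is left out because it needs a matrix factorization step.

```python
def label_encode(values):
    """Map each distinct category to an integer index (alphabetical order)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Map each value to a 0/1 vector with a single 1 at its category index."""
    codes, categories = label_encode(values)
    width = len(categories)
    vectors = [[1 if j == code else 0 for j in range(width)] for code in codes]
    return vectors, categories


# Hypothetical values of one categorical fingerprint attribute.
browsers = ["Firefox", "Chrome", "Safari", "Chrome"]
labels, cats = label_encode(browsers)
vectors, _ = one_hot_encode(browsers)
```

Label encoding yields one integer per value and so implies an ordering between categories, while one-hot encoding avoids that ordering at the cost of one column per category.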
2

Lv, Zhibin, Hui Ding, Lei Wang, and Quan Zou. "A Convolutional Neural Network Using Dinucleotide One-hot Encoder for identifying DNA N6-Methyladenine Sites in the Rice Genome." Neurocomputing 422 (January 2021): 214–21. http://dx.doi.org/10.1016/j.neucom.2020.09.056.

3

Teng, Lin, Hang Li, and Shahid Karim. "DMCNN: A Deep Multiscale Convolutional Neural Network Model for Medical Image Segmentation." Journal of Healthcare Engineering 2019 (December 27, 2019): 1–10. http://dx.doi.org/10.1155/2019/8597606.

Abstract:
Medical image segmentation is one of the hot issues in the related area of image processing. Precise segmentation of medical images is a vital guarantee for follow-up treatment. At present, however, low gray contrast and blurred tissue boundaries are common in medical images, and the segmentation accuracy of medical images cannot be effectively improved. In particular, deep learning methods need more training samples, which leads to a time-consuming process. Therefore, we propose a novel model for medical image segmentation based on a deep multiscale convolutional neural network (CNN) in this article. First, we extract the region of interest from the raw medical images. Then, data augmentation is applied to acquire more training data. Our proposed method contains three models: encoder, U-net, and decoder. The encoder is mainly responsible for feature extraction from 2D image slices. The U-net cascades the features of each block of the encoder with those obtained by deconvolution in the decoder under different scales. The decoder is mainly responsible for the upsampling of the feature graph after feature extraction of each group. Simulation results show that the new method can boost the segmentation accuracy, and that it has strong robustness compared with other segmentation methods.
4

Liu, Xiaoning, Zhihao Ke, Yining Chen, and Zigang Deng. "The feasibility of designing a back propagation neural network to predict the levitation force of high-temperature superconducting magnetic levitation." Superconductor Science and Technology 35, no. 4 (March 3, 2022): 044004. http://dx.doi.org/10.1088/1361-6668/ac55f5.

Abstract:
The levitation force between the superconductor and the magnet is highly nonlinear and affected by the coupling of multiple factors, which brings many obstacles to research and application. In addition to experimental methods and finite element simulations, the booming artificial neural network (ANN), which is adept at continuous nonlinear fitting, may provide another solution to predict the levitation force. This topic has not been deeply investigated so far. Therefore, this study aims to apply the ANN to predict the levitation force, and a typical neural network trained with back propagation (BP) is adopted. The data set, with 2399 pieces of data, considers nine input factors and one force output, and was experimentally obtained with several test devices. The pre-processing of the data set comprises cleaning, balancing, one-hot encoding (for the discrete classified variable), normalization (for the continuous variable) and randomization. A classical perceptron with three layers (input, hidden and output layer) is applied in this paper, and the gradient descent back propagation algorithm reduces the error by iteration. The assessment and evaluation of the network show that high prediction accuracy can be achieved. The prediction results illustrate the features of the force well (nonlinearity, hysteresis, external field dependence and type difference between the bulk and stack), which confirms the feasibility of using a BP neural network to predict the levitation force. Furthermore, the performance of the neural network is determined by the data set, especially the uniformity and balance among factors in the set. Moreover, a huge gap in the quantity of data between factors prevents the network from making a comprehensive judgment. In this situation, binary one-hot encoding of the small-quantity, discrete data factor is efficient: instead of the actual value of the factor, the one-hot encoded data only represent the category. In addition, a label encoder method is adopted to distinguish descent from ascent (descent = 1, ascent = 0) for the force hysteresis.
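The pre-processing steps listed in this abstract can be sketched in a few lines of standard-library Python. This is a hedged illustration only: the field names (`gap_mm`, `sample_type`, `direction`), the sample rows, and the choice of min-max scaling for "normalization" are assumptions, since the abstract does not specify them. Only the bulk/stack categories and the descent = 1, ascent = 0 coding come from the text.

```python
def min_max_normalize(xs):
    """Scale continuous values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


def one_hot(value, categories):
    """Return a 0/1 vector marking which category `value` belongs to."""
    return [1.0 if value == c else 0.0 for c in categories]


SAMPLE_TYPES = ["bulk", "stack"]              # discrete factor, one-hot encoded
DIRECTION = {"descent": 1.0, "ascent": 0.0}   # label encoding from the abstract

# Hypothetical measurements; the real data set has nine input factors.
rows = [
    {"gap_mm": 10.0, "sample_type": "bulk", "direction": "descent"},
    {"gap_mm": 30.0, "sample_type": "stack", "direction": "ascent"},
    {"gap_mm": 20.0, "sample_type": "bulk", "direction": "ascent"},
]

gaps = min_max_normalize([r["gap_mm"] for r in rows])
features = [
    [g] + one_hot(r["sample_type"], SAMPLE_TYPES) + [DIRECTION[r["direction"]]]
    for g, r in zip(gaps, rows)
]
```

Each feature row then holds one normalized continuous value, a two-element one-hot block for the sample type, and a single 0/1 flag for the direction of motion.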
5

Kapočiūtė-Dzikienė, Jurgita. "A Domain-Specific Generative Chatbot Trained from Little Data." Applied Sciences 10, no. 7 (March 25, 2020): 2221. http://dx.doi.org/10.3390/app10072221.

Abstract:
Accurate generative chatbots are usually trained on large datasets of question–answer pairs. Although such datasets do not exist for some languages, this does not reduce the need for companies to have chatbot technology on their websites. However, companies usually own small domain-specific datasets (at least in the form of an FAQ) about their products, services, or used technologies. In this research, we seek effective solutions to create generative seq2seq-based chatbots from very small data. Since experiments are carried out in English and morphologically complex Lithuanian languages, we have an opportunity to compare results for languages with very different characteristics. We experimentally explore three encoder–decoder LSTM-based approaches (simple LSTM, stacked LSTM, and BiLSTM), three word embedding types (one-hot encoding, fastText, and BERT embeddings), and five encoder–decoder architectures based on different encoder and decoder vectorization units. Furthermore, all offered approaches are applied to the pre-processed datasets with removed and separated punctuation. The experimental investigation revealed the advantages of the stacked LSTM and BiLSTM encoder architectures and BERT embedding vectorization (especially for the encoder). The best achieved BLEU on English/Lithuanian datasets with removed and separated punctuation was ~0.513/~0.505 and ~0.488/~0.439, respectively. Better results were achieved with the English language, because generating different inflection forms for the morphologically complex Lithuanian is a harder task. The BLEU scores fell into the range defining the quality of the generated answers as good or very good for both languages. This research was performed with very small datasets having little variety in covered topics, which makes this research not only more difficult, but also more interesting. Moreover, to our knowledge, it is the first attempt to train generative chatbots for a morphologically complex language.
6

Zhang, Kun, Le Wu, Guangyi Lv, Meng Wang, Enhong Chen, and Shulan Ruan. "Making the Relation Matters: Relation of Relation Learning Network for Sentence Semantic Matching." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14411–19. http://dx.doi.org/10.1609/aaai.v35i16.17694.

Abstract:
Sentence semantic matching is one of the fundamental tasks in natural language processing, which requires an agent to determine the semantic relation among input sentences. Recently, deep neural networks have achieved impressive performance in this area, especially BERT. Despite the effectiveness of these models, most of them treat output labels as meaningless one-hot vectors, underestimating the semantic information and guidance of relations that these labels reveal, especially for tasks with a small number of labels. To address this problem, we propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching. Specifically, we first employ BERT to encode the input sentences from a global perspective. Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective. To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task for guiding R2-Net to consider more about labels. Meanwhile, a triplet loss is employed to distinguish the intra-class and inter-class relations in a finer granularity. Empirical experiments on two sentence semantic matching tasks demonstrate the superiority of our proposed model. As a byproduct, we have released the codes to facilitate other researches.
7

Gaskarov, Rodion Dmitrievich, Alexey Mikhailovich Biryukov, Alexey Fedorovich Nikonov, Daniil Vladislavovich Agniashvili, and Danil Aydarovich Khayrislamov. "Steel Defects Analysis Using CNN (Convolutional Neural Networks)." Russian Digital Libraries Journal 23, no. 6 (August 4, 2020): 1155–71. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1155-1171.

Abstract:
Steel is one of the most important bulk materials these days. It is used almost everywhere - from medicine to industry. Detecting this material's defects is one of the most challenging problems for industries worldwide. This process is also manual and time-consuming. Through this study we tried to automate this process. A convolutional neural network model UNet was used for this task for more accurate segmentation with less training image data set for our model. The essence of this NN (neural network) is in step-by-step convolution of every image (encoding) and then stretching them to initial resolution, consequently getting a mask of an image with various classes on it. The foremost modification is changing an input image's size to 128x800 px resolution (original images in dataset are 256x1600 px) because of GPU memory size's limitation. Secondly, we used ResNet34 CNN (convolutional neural network) as encoder, which was pre-trained on ImageNet1000 dataset with modified output layer - it shows 4 layers instead of 34. After running tests of this model, we obtained 92.7% accuracy using images of hot-rolled steel sheets.
8

Wu, Chunming, and Zhou Zeng. "A fault diagnosis method based on Auxiliary Classifier Generative Adversarial Network for rolling bearing." PLOS ONE 16, no. 3 (March 1, 2021): e0246905. http://dx.doi.org/10.1371/journal.pone.0246905.

Abstract:
Rolling bearing fault diagnosis is one of the challenging tasks and hot research topics in the condition monitoring and fault diagnosis of rotating machinery. However, in practical engineering applications, the working conditions of rotating machinery are various, and it is difficult to extract the effective features of early fault due to the vibration signal accompanied by high background noise pollution, and there are only a small number of fault samples for fault diagnosis, which leads to the significant decline of diagnostic performance. In order to solve above problems, by combining Auxiliary Classifier Generative Adversarial Network (ACGAN) and Stacked Denoising Auto Encoder (SDAE), a novel method is proposed for fault diagnosis. Among them, during the process of training the ACGAN-SDAE, the generator and discriminator are alternately optimized through the adversarial learning mechanism, which makes the model have significant diagnostic accuracy and generalization ability. The experimental results show that our proposed ACGAN-SDAE can maintain a high diagnosis accuracy under small fault samples, and have the best adaptation performance across different load domains and better anti-noise performance.
9

Weng, Weinan. "Research on the House Price Forecast Based on machine learning algorithm." BCP Business & Management 32 (November 22, 2022): 134–47. http://dx.doi.org/10.54691/bcpbm.v32i.2881.

Abstract:
House prices fluctuate every year due to factors such as location, area, facilities and so on. Housing price prediction is a significant topic in real estate, and it helps buyers make strategic decisions about house dealing. There is much research on house price forecasting, yet the current research does not comprehensively compare and analyze the popular house price prediction approaches. Constructing a model begins with pre-processing the data to fill null values or remove outliers, and categorical attributes can be converted into the required numeric attributes using the one-hot encoder methodology. This paper used five algorithms, namely decision tree, random forest regression, Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), and extreme gradient boosting (XGBoost), to predict house prices and compared them according to the root mean squared error. This paper found that GBDT and XGBoost give more accurate predictions than the other algorithms. Besides, this paper identified which features most affect the price of a house. In real-world applications, machine learning based housing price prediction models are utilized by banks and financial institutions to obtain better house price assessment, risk analysis and lending decisions.
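The comparison criterion named in this abstract, root mean squared error, is simple enough to sketch directly. The snippet below is a minimal illustration under stated assumptions: the prices and the two sets of predictions are made-up numbers, and the labels `model_a` and `model_b` are hypothetical stand-ins rather than results from the paper.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical house prices (in thousands) and two sets of predictions.
actual = [200.0, 350.0, 500.0]
predictions = {
    "model_a": [210.0, 340.0, 490.0],
    "model_b": [230.0, 320.0, 540.0],
}

scores = {name: rmse(actual, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)  # lower RMSE means a closer fit
```

Because RMSE squares each error before averaging, a few large misses penalize a model more than many small ones.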
10

Tang, Xu, Chao Liu, Jingjing Ma, Xiangrong Zhang, Fang Liu, and Licheng Jiao. "Large-Scale Remote Sensing Image Retrieval Based on Semi-Supervised Adversarial Hashing." Remote Sensing 11, no. 17 (September 1, 2019): 2055. http://dx.doi.org/10.3390/rs11172055.

Abstract:
Remote sensing image retrieval (RSIR), a superior content organization technique, plays an important role in the remote sensing (RS) community. As the number of RS images increases explosively, not only the retrieval precision but also the retrieval efficiency is emphasized in the large-scale RSIR scenario. Therefore, the approximate nearest neighborhood (ANN) search increasingly attracts researchers' attention. In this paper, we propose a new hash learning method, named semi-supervised deep adversarial hashing (SDAH), to accomplish the ANN for the large-scale RSIR task. The assumption of our model is that the RS images have been represented by the proper visual features. First, a residual auto-encoder (RAE) is developed to generate the class variable and hash code. Second, two multi-layer networks are constructed to regularize the obtained latent vectors using the prior distribution. These two modules are integrated under the generator adversarial framework. Through the minimax learning, the class variable becomes a one-hot-like vector while the hash code becomes a binary-like vector. Finally, a specific hashing function is formulated to enhance the quality of the generated hash code. The effectiveness of the hash codes learned by our SDAH model was demonstrated by the positive experimental results obtained on three public RS image archives. Compared with the existing hash learning methods, the proposed method reaches improved performance.
11

Peng, Jiang-Zhou, Xianglei Liu, Zhen-Dong Xia, Nadine Aubry, Zhihua Chen, and Wei-Tao Wu. "Data-Driven Modeling of Geometry-Adaptive Steady Heat Convection Based on Convolutional Neural Networks." Fluids 6, no. 12 (December 1, 2021): 436. http://dx.doi.org/10.3390/fluids6120436.

Abstract:
Heat convection is one of the main mechanisms of heat transfer, and it involves both heat conduction and heat transportation by fluid flow; as a result, it usually requires numerical simulation for solving heat convection problems. Although the derivation of governing equations is not difficult, the solution process can be complicated and usually requires numerical discretization and iteration of differential equations. In this paper, based on neural networks, we developed a data-driven model for an extremely fast prediction of steady-state heat convection of a hot object with an arbitrary complex geometry in a two-dimensional space. According to the governing equations, the steady-state heat convection is dominated by convection and thermal diffusion terms; thus the distribution of the physical fields would exhibit stronger correlations between adjacent points. Therefore, the proposed neural network model uses convolutional neural network (CNN) layers as the encoder and deconvolutional neural network (DCNN) layers as the decoder. Compared with a fully connected (FC) network model, the CNN-based model is good for capturing and reconstructing the spatial relationships of low-rank feature spaces, such as edge intersections, parallelism, and symmetry. Furthermore, we applied the signed distance function (SDF) as the network input for representing the problem geometry, which contains more information compared with a binary image. For displaying the strong learning and generalization ability of the proposed network model, the training dataset only contains hot objects with simple geometries: triangles, quadrilaterals, pentagons, hexagons, and dodecagons, while the testing cases use arbitrary and complex geometries. According to the study, the trained network model can accurately predict the velocity and temperature fields of problems with complex geometries that were never seen by the network during training, and the prediction is two orders of magnitude faster than CFD. The accurate and extremely fast prediction of the network model suggests the potential of applying reduced-order network models to real-time control and fast optimization in the future.
12

Segal, Yoram, and Ofer Hadar. "Covert channel implementation using motion vectors over H.264 compression." Revista de Estudos e Pesquisas Avançadas do Terceiro Setor 2, no. 2 (August 18, 2019): 111. http://dx.doi.org/10.31501/repats.v2i2.10567.

Abstract:
Embedding information inside video streaming is a hot topic in the world of video broadcasting. Information assimilation can be used for positive purposes, such as copyright protection. On the other hand, it can be used for malicious purposes, such as a remote hostile takeover of end-user devices. The basic idea of information assimilation technology within a video is to take advantage of the sequence of frames that flows between the video server and the viewer. By casting foreign data into each frame, a hidden communication channel, namely a covert channel, is created. Attackers find the multimedia world in general, and video streaming in particular, an attractive backdoor for cyber-attacks. Multimedia covert channels provide reasonable bandwidth and long-lasting transmission streams, suitable for planting malicious information and therefore used as an exploit alternative. In this article, we propose a method to protect against attacks that use video payload for transferring confidential data using a covert channel. This work is part of a large-scale study of video attack methods. The goal of the study is to build a generic platform that will investigate the reliability of video sequences. The platform allows encoding and decoding of video. A plugin can be added to each encoder or decoder. Each plugin is an algorithm that is studied and developed in the framework of this study. One of the algorithms in this platform is information transmission over video using motion vectors. This method is the topic of this article.
13

Norhalimi, Muhammad, and Taghfirul Azhima Yoga Siswa. "Optimasi Seleksi Fitur Information Gain pada Algoritma Naïve Bayes dan K-Nearest Neighbor." JISKA (Jurnal Informatika Sunan Kalijaga) 7, no. 3 (September 25, 2022): 237–55. http://dx.doi.org/10.14421/jiska.2022.7.3.237-255.

Abstract:
There was an increase in the number of late payments of tuition fees by 3,018 from a total of 5,535 students at the end of 2020. This study uses a Python library that requires numeric data, so the data must be transformed according to its type: data that has a scale is transformed using an ordinal encoder, and data that does not have a scale is transformed using one-hot encoding. The purpose of this study was to evaluate the performance of the Naïve Bayes algorithm and K-Nearest Neighbor with a confusion matrix in predicting late payment of tuition fees at UMKT. The dataset used in this study was sourced from the financial administration bureau and contains 12,408 records with a distribution of 90:10. Based on the information gain feature selection, the best 4 attributes that influence the research are obtained, namely faculty, study program, class, and gender. In the confusion matrix evaluation, the best performance was achieved by Naïve Bayes with information gain, with an accuracy of 55.19%, while K-Nearest Neighbor with information gain obtained an accuracy of only 50.76%. Using the attributes selected by information gain increased the accuracy of Naïve Bayes, but decreased the accuracy of the K-Nearest Neighbor algorithm.
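The information gain criterion used for feature selection in this abstract can be illustrated with a short standard-library Python sketch. It computes the textbook definition (label entropy minus the weighted entropy after splitting on a feature). The tiny data set and the column values are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy within each feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical records: payment status plus two candidate attributes.
status = ["late", "late", "on_time", "on_time"]
faculty = ["A", "A", "B", "B"]   # separates the two classes perfectly
gender = ["f", "m", "f", "m"]    # carries no information about the class

gain_faculty = information_gain(faculty, status)
gain_gender = information_gain(gender, status)
```

Ranking attributes by this score and keeping the top few is one common way to arrive at a short attribute list like the four named above.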
14

Yang, Xu, Jianguo Chen, and Zhijun Chen. "Classification of Alteration Zones Based on Drill Core Hyperspectral Data Using Semi-Supervised Adversarial Autoencoder: A Case Study in Pulang Porphyry Copper Deposit, China." Remote Sensing 15, no. 4 (February 15, 2023): 1059. http://dx.doi.org/10.3390/rs15041059.

Abstract:
With the development of hyperspectral technology, it has become possible to classify alteration zones using hyperspectral data. Since various altered rocks are comprehensive manifestations of mineral assemblages, their spectra are highly similar, which greatly increases the difficulty of distinguishing among them. In this study, a Semi-Supervised Adversarial Autoencoder (SSAAE) was proposed to classify the alteration zones, using the drill core hyperspectral data collected from the Pulang porphyry copper deposit. The multiscale feature extractor was first integrated into the encoder to fully exploit and mine the latent feature representations of hyperspectral data, which were further transformed into discrete class vectors using a classifier. Second, the decoder reconstructed the original inputs with the latent and class vectors. Third, we imposed a categorical distribution on the discrete class vectors represented in the one-hot form using the adversarial regularization process and incorporated the supervised classification process into the network to better guide the network training using the limited labeled data. The comparison experiments on the synthetic dataset and measured hyperspectral dataset were conducted to quantitatively and qualitatively certify the effect of the proposed method. The results show that the SSAAE outperformed six other methods for classifying alteration zones. Moreover, we further displayed the delineated results of the SSAAE on the cross-section, in which the alteration zones were sensible from a geological point of view and had good spatial consistency with the occurrence of Cu, which further demonstrates that the SSAAE had good applicability for the classification of alteration zones.
15

Kou, Liang, Shanshuo Ding, Ting Wu, Wei Dong, and Yuyu Yin. "An Intrusion Detection Model for Drone Communication Network in SDN Environment." Drones 6, no. 11 (November 4, 2022): 342. http://dx.doi.org/10.3390/drones6110342.

Abstract:
Drone communication is currently a hot topic of research, and the use of drones can easily set up communication networks in areas with complex terrain or areas subject to disasters and has broad application prospects. One of the many challenges currently facing drone communication is the communication security issue. Drone communication networks generally use software defined network (SDN) architectures, and SDN controllers can provide reliable data forwarding control for drone communication networks, but they are also highly susceptible to attacks and pose serious security threats to drone networks. In order to solve the security problem, this paper proposes an intrusion detection model that can reach the convergence state quickly. The model consists of a deep auto-encoder (DAE), a convolutional neural network (CNN), and an attention mechanism. DAE is used to reduce the original data dimensionality and improve the training efficiency, CNN is used to extract the data features, the attention mechanism is used to enhance the important features of the data, and finally the traffic is detected and classified. We conduct tests using the InSDN dataset, which is collected from an SDN environment and is able to verify the effectiveness of the model on SDN traffic. The experiments utilize the Tensorflow framework to build a deep learning model structure, which is run on the Jupyter Notebook platform in the Anaconda environment. Compared with the CNN model, the LSTM model, and the CNN+LSTM hybrid model, the accuracy of this model in binary classification experiments is 99.7%, which is about 0.6% higher than other comparison models. The accuracy of the model in the multiclassification experiment is 95.5%, which is about 3% higher than other comparison models. Additionally, it only needs 20 to 30 iterations to converge, which is only one-third of other models. The experiment proves that the model has fast convergence speed and high precision and is an effective detection method.
16

Hadjicostis, Christoforos N. "Periodic and non-concurrent error detection and identification in one-hot encoded FSMs." Automatica 40, no. 10 (October 2004): 1665–76. http://dx.doi.org/10.1016/j.automatica.2004.05.005.

17

ZHAO Chen-guang (赵晨光), ZHOU Ci-ming (周次明), PANG Yan-dong (庞彦东), FAN Dian (范典), CHEN Xi (陈希), LIU Han-jie (刘涵洁), ZHOU Qing (周卿), and LI Yu-xiao (李宇潇). "Phase Compensation Method of Fizeau Interference Demodulation Based on One-hot Encoded Finite State Machine." ACTA PHOTONICA SINICA 49, no. 5 (2020): 506001. http://dx.doi.org/10.3788/gzxb20204905.0506001.

18

KOCAK, TASKIN, GEORGE R. HARRIS, and RONALD F. DEMARA. "SELF-TIMED ARCHITECTURE FOR MASKED SUCCESSIVE APPROXIMATION ANALOG-TO-DIGITAL CONVERSION." Journal of Circuits, Systems and Computers 16, no. 01 (February 2007): 1–14. http://dx.doi.org/10.1142/s0218126607003551.

Abstract:
In this paper, a novel architecture for self-timed analog-to-digital conversion is presented and designed using the NULL Convention Logic (NCL) paradigm. This analog-to-digital converter (ADC) employs successive approximation and a one-hot encoded masking technique to digitize analog signals. The architecture scales readily to any given resolution by utilizing the one-hot encoded scheme to permit identical logical components for each bit of resolution. The four-bit configuration of the proposed design has been implemented and assessed via simulation in 0.18-μm CMOS technology. Furthermore, the ADC may be interfaced with either synchronous or four-phase asynchronous digital systems.
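As a rough software analogy for the scheme this abstract describes, the following Python sketch models successive approximation with a one-hot mask that walks from the most significant bit to the least significant bit. It is an illustration only: the function name `sar_adc`, the ideal comparator, and the reference voltage are assumptions, and the sketch says nothing about the self-timed NCL hardware itself.

```python
def sar_adc(v_in, v_ref=1.0, bits=4):
    """Digitize v_in in [0, v_ref) by testing one bit per step, MSB first."""
    code = 0
    mask = 1 << (bits - 1)  # one-hot mask: exactly one bit set at a time
    while mask:
        trial = code | mask                    # tentatively set the masked bit
        threshold = trial * v_ref / (1 << bits)
        if v_in >= threshold:                  # ideal comparator decision
            code = trial                       # keep the bit
        mask >>= 1                             # move the single hot bit down
    return code


# Four-bit conversions of a few hypothetical input voltages.
samples = [0.0, 0.3, 0.5, 0.99]
codes = [sar_adc(v) for v in samples]
```

Because every step performs the same operation with only the position of the hot bit changing, the per-bit logic is identical, which is in line with the abstract's remark that the scheme permits identical logical components for each bit of resolution.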
19

Graham, Sheila. "It's an RNA world: a focus on recent advances in RNA biochemistry." Biochemist 38, no. 2 (April 1, 2016): 4–7. http://dx.doi.org/10.1042/bio03802004.

Abstract:
RNA is a fascinating molecule. Its array of different properties is highlighted by our knowledge of the ribosome. RNA can have structural properties; for example, rRNA is the core of the ribosome. RNA can bind proteins; for example, rRNA–ribosomal protein interactions are used to build the protein translation machinery. Finally, RNA can display enzymatic catalysis. In the ribosome during translation, non-coding RNA carries out decoding (tRNA) and amino acid polymerization (rRNA). If this is not fascinating enough, the last decade or so has seen a considerable reassessment of the core of Francis Crick's ‘central dogma of molecular biology’ that states that RNA molecules (rRNAs, tRNAs and mRNAs) serve to drive protein synthesis, decode mRNAs or act as templates encoding protein. Much of the upheaval in our understanding of RNA biology has come from deep mining of the human transcriptome by RNA sequencing (RNAseq) using next generation sequencing techniques. One of the most startling revelations from the wealth of new data provided by the ‘-omics’ revolution is that over 80% of the human genome encodes RNA, whereas only up to 2% encodes proteins. In other words, our genomes are largely RNA-coding. The discovery of the plethora of non-coding RNAs in our genomes has revolutionized molecular biology. These RNAs do not encode protein and, unlike rRNAs or tRNAs, most are not intimately linked to protein translation. In this edition of The Biochemist, we revisit recent advances in RNA research to reveal the broad scope of this hot topic in today's biochemistry and to spotlight some new areas of RNA research.
20

Park, Sangyong, Jaeseon Kim, and Yong Seok Heo. "Semantic Segmentation Using Pixel-Wise Adaptive Label Smoothing via Self-Knowledge Distillation for Limited Labeling Data." Sensors 22, no. 7 (March 29, 2022): 2623. http://dx.doi.org/10.3390/s22072623.

Abstract:
To achieve high performance, most deep convolutional neural networks (DCNNs) require a significant amount of training data with ground truth labels. However, creating ground-truth labels for semantic segmentation requires more time, human effort, and cost compared with other tasks such as classification and object detection, because the ground-truth label of every pixel in an image is required. Hence, it is practically demanding to train DCNNs using a limited amount of training data for semantic segmentation. Generally, training DCNNs using a limited amount of data is problematic as it easily results in a decrease in the accuracy of the networks because of overfitting to the training data. Here, we propose a new regularization method called pixel-wise adaptive label smoothing (PALS) via self-knowledge distillation to stably train semantic segmentation networks in a practical situation, in which only a limited amount of training data is available. To mitigate the problem caused by limited training data, our method fully utilizes the internal statistics of pixels within an input image. Consequently, the proposed method generates a pixel-wise aggregated probability distribution using a similarity matrix that encodes the affinities between all pairs of pixels. To further increase the accuracy, we add one-hot encoded distributions with ground-truth labels to these aggregated distributions, and obtain our final soft labels. We demonstrate the effectiveness of our method for the Cityscapes dataset and the Pascal VOC2012 dataset using limited amounts of training data, such as 10%, 30%, 50%, and 100%. Based on various quantitative and qualitative comparisons, our method demonstrates more accurate results compared with previous methods. Specifically, for the Cityscapes test set, our method achieved mIoU improvements of 0.076%, 1.848%, 1.137%, and 1.063% for 10%, 30%, 50%, and 100% training data, respectively, compared with the method of the cross-entropy loss using one-hot encoding with ground truth labels.
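The step of adding a one-hot distribution to an aggregated distribution can be sketched for a single pixel as follows. This is only a minimal illustration: the helper names, the `weight` parameter, and the renormalization by the sum are assumptions, since the abstract does not give the paper's exact formula.

```python
def one_hot(label, num_classes):
    """Return the one-hot distribution for a ground-truth class index."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def soft_label(label, aggregated, weight=1.0):
    """Add a one-hot vector to an aggregated distribution and renormalize.

    `aggregated` stands in for the per-pixel aggregated probabilities;
    `weight` is a hypothetical mixing factor, not taken from the paper.
    """
    hot = one_hot(label, len(aggregated))
    mixed = [weight * h + a for h, a in zip(hot, aggregated)]
    total = sum(mixed)
    return [m / total for m in mixed]

# A pixel labeled as class 0 whose neighbors partly suggest classes 1 and 2.
print(soft_label(0, [0.5, 0.25, 0.25]))  # [0.75, 0.125, 0.125]
```

The result keeps the ground-truth class dominant while leaving some probability mass on the other classes, which is the general idea of label smoothing.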
21

Wemheuer, Bernd, Robert Taube, Pinar Akyol, Franziska Wemheuer, and Rolf Daniel. "Microbial Diversity and Biochemical Potential Encoded by Thermal Spring Metagenomes Derived from the Kamchatka Peninsula." Archaea 2013 (2013): 1–13. http://dx.doi.org/10.1155/2013/136714.

Abstract:
Volcanic regions contain a variety of environments suitable for extremophiles. This study was focused on assessing and exploiting the prokaryotic diversity of two microbial communities derived from different Kamchatkian thermal springs by metagenomic approaches. Samples were taken from a thermoacidophilic spring near the Mutnovsky Volcano and from a thermophilic spring in the Uzon Caldera. Environmental DNA for metagenomic analysis was isolated from collected sediment samples by direct cell lysis. The prokaryotic community composition was examined by analysis of archaeal and bacterial 16S rRNA genes. A total number of 1235 16S rRNA gene sequences were obtained and used for taxonomic classification. Most abundant in the samples were members of Thaumarchaeota, Thermotogae, and Proteobacteria. The Mutnovsky hot spring was dominated by the Terrestrial Hot Spring Group, Kosmotoga, and Acidithiobacillus. The Uzon Caldera was dominated by uncultured members of the Miscellaneous Crenarchaeotic Group and Enterobacteriaceae. The remaining 16S rRNA gene sequences belonged to the Aquificae, Dictyoglomi, Euryarchaeota, Korarchaeota, Thermodesulfobacteria, Firmicutes, and some potential new phyla. In addition, the recovered DNA was used for generation of metagenomic libraries, which were subsequently mined for genes encoding lipolytic and proteolytic enzymes. Three novel genes conferring lipolytic and one gene conferring proteolytic activity were identified.
22

Almajid, Adi Sakti. "Multilayer Perceptron Optimization on Imbalanced Data Using SVM-SMOTE and One-Hot Encoding for Credit Card Default Prediction." Journal of Advances in Information Systems and Technology 3, no. 2 (September 6, 2022): 67–74. http://dx.doi.org/10.15294/jaist.v3i2.57061.

Abstract:
Credit risk assessment by classifying potential users is an important process for reducing the occurrence of defaulting users. The problem faced when classifying real-world datasets is imbalanced data, which biases model training toward the majority class. The algorithm then focuses only on the majority class and ignores the minority class, even though both classes are equally important. To overcome this problem, a combination of one-hot encoding (OHE) and the SVM-synthetic minority oversampling technique (SVM-SMOTE) is used to optimize the MLP classification algorithm. OHE is used to encode nominal categorical values and SVM-SMOTE is used for oversampling. The performance of the model generated from the optimized MLP is then compared with the baseline using the AUC score. The data used is the default of credit card clients dataset from Taiwan, which has 30000 instances. The highest AUC score of the optimized MLP is 0.7184, an increase of 0.2179 compared to the baseline.
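For readers unfamiliar with the encoding step, one-hot encoding of a nominal column can be sketched in a few lines. The function name and the example values are hypothetical and are not taken from the paper or its dataset.

```python
def one_hot_encode(values):
    """One-hot encode a list of nominal category values.

    Returns the sorted category list and one 0/1 row per input value,
    with exactly one 1 per row.
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows

# Hypothetical nominal feature values.
cats, rows = one_hot_encode(["single", "married", "single", "other"])
print(cats)  # ['married', 'other', 'single']
print(rows)  # [[0, 0, 1], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Each category gets its own column, so no artificial ordering is imposed on nominal values, at the cost of one extra feature per category.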
23

Touska, Filip, Brian Turnquist, Viktorie Vlachova, Peter W. Reeh, Andreas Leffler, and Katharina Zimmermann. "Heat-resistant action potentials require TTX-resistant sodium channels NaV1.8 and NaV1.9." Journal of General Physiology 150, no. 8 (July 3, 2018): 1125–44. http://dx.doi.org/10.1085/jgp.201711786.

Abstract:
Damage-sensing nociceptors in the skin provide an indispensable protective function thanks to their specialized ability to detect and transmit hot temperatures that would block or inflict irreversible damage in other mammalian neurons. Here we show that the exceptional capacity of skin C-fiber nociceptors to encode noxiously hot temperatures depends on two tetrodotoxin (TTX)-resistant sodium channel α-subunits: NaV1.8 and NaV1.9. We demonstrate that NaV1.9, which is commonly considered an amplifier of subthreshold depolarizations at 20°C, undergoes a large gain of function when temperatures rise to the pain threshold. We also show that this gain of function renders NaV1.9 capable of generating action potentials with a clear inflection point and positive overshoot. In the skin, heat-resistant nociceptors appear as two distinct types with unique and possibly specialized features: one is blocked by TTX and relies on NaV1.9, and the second type is insensitive to TTX and composed of both NaV1.8 and NaV1.9. Independent of rapidly gated TTX-sensitive NaV channels that form the action potential at pain threshold, NaV1.8 is required in all heat-resistant nociceptors to encode temperatures higher than ∼46°C, whereas NaV1.9 is crucial for shaping the action potential upstroke and keeping the NaV1.8 voltage threshold within reach.
24

Zhang, Ping, and Fang Hu. "Analysis on the Way and Potential of Economic Low-Carbon Development of China Based on Genetic Algorithm." Mathematical Problems in Engineering 2022 (July 5, 2022): 1–7. http://dx.doi.org/10.1155/2022/1587251.

Abstract:
How a developing China can meet the challenges of the post-Kyoto era in the process of rapid industrialization is a hot issue in current academic research. Facing the pressure of the international community to reduce emissions and the energy and resource constraints under the development trend of the heavy chemical industry, China can only turn the pressure into a driving force and seek a low-carbon development path. This paper proposes a prediction model for China’s low-carbon economic development based on the combined model of genetic algorithm (GA) and long short-term memory neural network (LSTM). The data are encoded with one-hot, embedding is used to reduce the dimension, and the genetic algorithm is used to obtain the optimal hyperparameters of the LSTM model to improve the accuracy of the model. The results show that the model accuracy remains above 90%.
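The combination of one-hot encoding followed by an embedding can be illustrated with a small sketch: multiplying a one-hot vector by an embedding table simply selects one row, which is why an embedding reduces a wide one-hot input to a short dense vector. The table values below are made up for illustration, and the helper names are assumptions; the paper's actual dimensions are not stated in the abstract.

```python
def one_hot(index, size):
    """One-hot row vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def embed(one_hot_vec, table):
    """Multiply a one-hot row vector by an embedding table (size x dim)."""
    dim = len(table[0])
    return [sum(h * row[d] for h, row in zip(one_hot_vec, table))
            for d in range(dim)]

# Hypothetical 3-category feature embedded into 2 dimensions.
table = [[0.1, 0.2],
         [0.3, 0.4],
         [0.5, 0.6]]
print(embed(one_hot(2, 3), table))  # [0.5, 0.6], i.e. row 2 of the table
```

In practice the table entries would be learned together with the rest of the network rather than fixed by hand.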
25

Qin, Xiwen, Dongmei Yin, Xiaogang Dong, Dongxue Chen, and Shuang Zhang. "Survival prediction model for right-censored data based on improved composite quantile regression neural network." Mathematical Biosciences and Engineering 19, no. 8 (2022): 7521–42. http://dx.doi.org/10.3934/mbe.2022354.

Abstract:
With the development of the field of survival analysis, statistical inference of right-censored data is of great importance for the study of medical diagnosis. In this study, a right-censored data survival prediction model based on an improved composite quantile regression neural network framework, called rcICQRNN, is proposed. It incorporates composite quantile regression with the loss function of a multi-hidden layer feedforward neural network, combined with an inverse probability weighting method for survival prediction. Meanwhile, the hyperparameters involved in the neural network are adjusted using the WOA algorithm, integer encoding and One-Hot encoding are implemented to encode the classification features, and the BWOA variable selection method for high-dimensional data is proposed. The rcICQRNN algorithm was tested on a simulated dataset and two real breast cancer datasets, and the performance of the model was evaluated by three evaluation metrics. The results show that the rcICQRNN-5 model is more suitable for analyzing simulated datasets. The One-Hot encoding of the WOA-rcICQRNN-30 model is more applicable to the NKI70 data. The model results are optimal for $k = 15$ after feature selection for the METABRIC dataset. Finally, we implemented the method for cross-dataset validation. On the whole, the C-index results using One-Hot encoding data are more stable, making the proposed rcICQRNN prediction model flexible enough to assist in medical decision making. It has practical applications in areas such as biomedicine, insurance actuarial and financial economics.
26

Ge, Feng-Xiang, Yanyu Bai, Mengjia Li, Guangping Zhu, and Jingwei Yin. "Label distribution-guided transfer learning for underwater source localization." Journal of the Acoustical Society of America 151, no. 6 (June 2022): 4140–49. http://dx.doi.org/10.1121/10.0011741.

Abstract:
Underwater source localization by deep neural networks (DNNs) is challenging since training these DNNs generally requires a large amount of experimental data and is computationally expensive. In this paper, label distribution-guided transfer learning (LD-TL) for underwater source localization is proposed, where a one-dimensional convolutional neural network (1D-CNN) is pre-trained with the simulation data generated by an underwater acoustic propagation model and then fine-tuned with a very limited amount of experimental data. In particular, the experimental data for fine-tuning the pre-trained 1D-CNN are labeled with label distribution vectors instead of one-hot encoded vectors. Experimental results show that the performance of underwater source localization with a very limited amount of experimental data is significantly improved by the proposed LD-TL.
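The difference between a one-hot label and a label distribution vector can be sketched as below. The Gaussian shape and the `sigma` parameter are assumptions chosen for illustration; the abstract does not state how the paper constructs its label distributions.

```python
import math

def label_distribution(true_index, num_classes, sigma=1.0):
    """Spread label mass over neighboring classes around the true class.

    A Gaussian profile is assumed here; classes are taken to be ordered
    (for example, discretized source ranges).
    """
    weights = [math.exp(-((k - true_index) ** 2) / (2 * sigma ** 2))
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

# One-hot would be [0, 0, 1, 0, 0]; the distribution also credits neighbors.
print(label_distribution(2, 5))
```

With ordered classes, such a target penalizes a near miss less than a far miss, which a one-hot target cannot express.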
27

Wood, Laura A., Gabrielle Larocque, Nicholas I. Clarke, Sourav Sarkar, and Stephen J. Royle. "New tools for “hot-wiring” clathrin-mediated endocytosis with temporal and spatial precision." Journal of Cell Biology 216, no. 12 (September 27, 2017): 4351–65. http://dx.doi.org/10.1083/jcb.201702188.

Abstract:
Clathrin-mediated endocytosis (CME) is the major route of receptor internalization at the plasma membrane. Analysis of constitutive CME is difficult because the initiation of endocytic events is unpredictable. When and where a clathrin-coated pit will form and what cargo it will contain are difficult to foresee. Here we describe a series of genetically encoded reporters that allow the initiation of CME on demand. A clathrin-binding protein fragment (“hook”) is inducibly attached to an “anchor” protein at the plasma membrane, which triggers the formation of new clathrin-coated vesicles. Our design incorporates temporal and spatial control by the use of chemical and optogenetic methods for inducing hook–anchor attachment. Moreover, the cargo is defined. Because several steps in vesicle creation are bypassed, we term it “hot-wiring.” We use hot-wired endocytosis to describe the functional interactions between clathrin and AP2. Two distinct sites on the β2 subunit, one on the hinge and the other on the appendage, are necessary and sufficient for functional clathrin engagement.
28

Yang, Sibo, Shusheng Wang, Lanyin Sun, Zhongxuan Luo, and Yuan Bao. "Output Layer Structure Optimization for Weighted Regularized Extreme Learning Machine Based on Binary Method." Symmetry 15, no. 1 (January 16, 2023): 244. http://dx.doi.org/10.3390/sym15010244.

Abstract:
In this paper, we focus on the redesign of the output layer for the weighted regularized extreme learning machine (WRELM). For multi-classification problems, the conventional method of the output layer setting, named the “one-hot method”, is as follows: Let the number of sample classes be $r$; then, the number of output layer nodes is $r$ and the ideal output of the $s$-th class is denoted by the $s$-th unit vector in $\mathbb{R}^r$ ($1 \le s \le r$). In this article, we propose a “binary method” to optimize the output layer structure: Let $2^{p-1} < r \le 2^p$, where $p \ge 2$; then $p$ output nodes are utilized and the ideal outputs are encoded as binary numbers. In this paper, the binary method is employed in WRELM. In general neural networks, the weights are updated through iterative calculation, which is the most important process, whereas in the extreme learning machine the weight matrix is calculated by the least squares method. That is, the coefficient matrix of the linear equations we solve is symmetric. For WRELM, we continue this idea, and the main part of the weight-solving process is a symmetric matrix. Compared with the one-hot method, the binary method requires fewer output layer nodes, especially when the number of sample categories is high. Thus, some memory space can be saved when storing data. In addition, the number of weights connecting the hidden and the output layer will also be greatly reduced, which will directly reduce the calculation time in the process of training the network. Numerical experiments are conducted to prove that compared with the one-hot method, the binary method can reduce the output nodes and hidden-output weights without damaging the learning precision.
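The node counts of the two output encodings can be compared with a short sketch. The function names are hypothetical, and class indices here are 0-based for convenience, whereas the abstract numbers classes from 1.

```python
def binary_nodes(r):
    """Smallest p with 2**(p-1) < r <= 2**p (the abstract assumes p >= 2)."""
    return (r - 1).bit_length()

def one_hot_target(s, r):
    """Ideal output for class s (0-based) under the one-hot method."""
    return [1 if k == s else 0 for k in range(r)]

def binary_target(s, p):
    """Ideal output for class s (0-based) as p binary digits, MSB first."""
    return [(s >> (p - 1 - i)) & 1 for i in range(p)]

r = 10
p = binary_nodes(r)
print(r, p)                    # 10 4
print(one_hot_target(5, r))    # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
print(binary_target(5, p))     # [0, 1, 0, 1]
```

For ten classes the output layer shrinks from ten nodes to four, and the number of hidden-to-output weights shrinks by the same ratio.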
29

Wegmann, Udo, Mary O'Connell-Motherway, Aldert Zomer, Girbe Buist, Claire Shearman, Carlos Canchaya, Marco Ventura, et al. "Complete Genome Sequence of the Prototype Lactic Acid Bacterium Lactococcus lactis subsp. cremoris MG1363." Journal of Bacteriology 189, no. 8 (February 16, 2007): 3256–70. http://dx.doi.org/10.1128/jb.01768-06.

Abstract:
Lactococcus lactis is of great importance for the nutrition of hundreds of millions of people worldwide. This paper describes the genome sequence of Lactococcus lactis subsp. cremoris MG1363, the lactococcal strain most intensively studied throughout the world. The 2,529,478-bp genome contains 81 pseudogenes and encodes 2,436 proteins. Of the 530 unique proteins, 47 belong to the COG (clusters of orthologous groups) functional category “carbohydrate metabolism and transport,” by far the largest category of novel proteins in comparison with L. lactis subsp. lactis IL1403. Nearly one-fifth of the 71 insertion elements are concentrated in a specific 56-kb region. This integration hot-spot region carries genes that are typically associated with lactococcal plasmids and a repeat sequence specifically found on plasmids and in the “lateral gene transfer hot spot” in the genome of Streptococcus thermophilus. Although the parent of L. lactis MG1363 was used to demonstrate lysogeny in Lactococcus, L. lactis MG1363 carries four remnant/satellite phages and two apparently complete prophages. The availability of the L. lactis MG1363 genome sequence will reinforce its status as the prototype among lactic acid bacteria through facilitation of further applied and fundamental research.
30

Gupta, Heena, and V. Asha. "Impact of Encoding of High Cardinality Categorical Data to Solve Prediction Problems." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 4197–201. http://dx.doi.org/10.1166/jctn.2020.9044.

Abstract:
The prediction problem in any domain is very important to assess the prices and preferences among people. This issue varies for different kinds of data. Data may be nominal or ordinal, and it may involve many categories or few. For any category to be considered by a machine learning algorithm, it needs to be encoded before any other operation can be performed. There are various encoding schemes available, such as label encoding, count encoding and one-hot encoding. This paper aims to understand the impact of various encoding schemes on the accuracy of prediction problems with high cardinality categorical data. The paper also proposes an encoding scheme based on curated strings. The domain chosen for this purpose is predicting the fees of doctors with different profiles and qualifications in various cities.
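Two of the schemes named in the abstract, label encoding and count encoding, can be sketched as follows, together with the column count a one-hot encoding would need. The function names and the city values are illustrative assumptions, not the paper's data or its proposed curated-string scheme.

```python
from collections import Counter

def label_encode(values):
    """Map each category to an integer id (ids follow sorted order)."""
    mapping = {c: i for i, c in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]

def count_encode(values):
    """Replace each category with how often it occurs in the column."""
    counts = Counter(values)
    return [counts[v] for v in values]

cities = ["Delhi", "Mumbai", "Delhi", "Pune"]   # hypothetical feature
print(label_encode(cities))   # [0, 1, 0, 2]
print(count_encode(cities))   # [2, 1, 2, 1]
print(len(set(cities)))       # 3 columns would be needed for one-hot
```

Label and count encoding keep a single column regardless of cardinality, whereas the one-hot width grows with the number of distinct categories, which is the trade-off that matters for high cardinality data.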
31

Mukherjee, Sudipto, Himanshu Asnani, Eugene Lin, and Sreeram Kannan. "ClusterGAN: Latent Space Clustering in Generative Adversarial Networks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4610–17. http://dx.doi.org/10.1609/aaai.v33i01.33014610.

Abstract:
Generative Adversarial networks (GANs) have obtained remarkable success in many unsupervised learning tasks and unarguably, clustering is an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon that GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
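Sampling a latent vector from a mixture of a one-hot part and a continuous part can be sketched as below. The function name, the dimensions, the uniform cluster choice, and the value of `sigma` are assumptions made for illustration; the abstract does not specify them.

```python
import random

def sample_latent(num_clusters, cont_dim, sigma=0.1, rng=random):
    """Return (z, cluster): continuous Gaussian part followed by a one-hot part."""
    cluster = rng.randrange(num_clusters)             # cluster id, uniform (assumed)
    z_cont = [rng.gauss(0.0, sigma) for _ in range(cont_dim)]
    z_hot = [1.0 if k == cluster else 0.0 for k in range(num_clusters)]
    return z_cont + z_hot, cluster

random.seed(0)
z, c = sample_latent(num_clusters=10, cont_dim=30)
print(len(z), c)   # vector length is 40; the last 10 entries are one-hot
```

Because the one-hot block identifies the cluster directly, an inverse network that recovers this block from a data point yields a cluster assignment.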
32

B M, Dr Sagar, Dr Cauvery N K, Dr Padmashree T, and Dr Rajkumar R. "Rice and Wheat Yield Prediction in India Using Decision Tree and Random Forest." Computational Intelligence and Machine Learning 3, no. 2 (October 14, 2022): 1–8. http://dx.doi.org/10.36647/ciml/03.02.a001.

Abstract:
Agriculture is one of the main sources of revenue and growth in the Indian economy. Obtaining a decent yield is often a gamble for farmers, considering the unpredictable environmental conditions. This paper deals with predicting the yield of rice and wheat with machine learning algorithms, using the annual crop yield production and the annual rainfall in the different districts of India. In this paper, a popular prediction model is developed using algorithms such as decision tree and random forest to predict the yield of the most widely grown crops in India, such as rice and wheat. The features used were the area of production, rainfall, season and state. The season and the state were one-hot encoded features. Mean square error was used to measure the loss. The dataset was prepared by combining the crop production in the various states and the rainfall dataset in the respective states. Index Terms: Machine Learning, XGBoost, Decision Tree, Random Forest, Data Preprocessing, Data Visualization, Prediction
33

Yuan, Zhi Ling, Yi Ping Yuan, and Meng Yang. "The Research of Resource Scheduling Based on Genetic Algorithm." Key Engineering Materials 522 (August 2012): 799–803. http://dx.doi.org/10.4028/www.scientific.net/kem.522.799.

Abstract:
The Job Shop scheduling problem, which is in essence a resource scheduling problem, has been proved to be NP-hard. It is of real practical significance for further research and has become a research hot spot. In a practical Job Shop, equipment resources are not unique: there are several machine tools of the frequently used types but only one of each rarely used type, and the processing procedures of components also differ considerably. We therefore put forward a genetic algorithm that considers the operation sequence and the machine choice simultaneously. To reach the shortest production period, the method adopts Gemini string encoding, combines it with the characteristics of the resource scheduling problem, and designs dedicated crossover and mutation operators. Simulation analysis of a specific example shows that the algorithm is effective.
34

Karimi, Younes, Anna Squicciarini, and Shomir Wilson. "Automated Detection of Doxing on Twitter." Proceedings of the ACM on Human-Computer Interaction 6, CSCW2 (November 7, 2022): 1–24. http://dx.doi.org/10.1145/3555167.

Abstract:
Doxing refers to the practice of disclosing sensitive personal information about a person without their consent. This form of cyberbullying is an unpleasant and sometimes dangerous phenomenon for online social networks. Although prior work exists on automated identification of other types of cyberbullying, a need exists for methods capable of detecting doxing on Twitter specifically. We propose and evaluate a set of approaches for automatically detecting second- and third-party disclosures on Twitter of sensitive private information, a subset of which constitutes doxing. We summarize our findings of common intentions behind doxing episodes and compare nine different approaches for automated detection based on string-matching and one-hot encoded heuristics, as well as word and contextualized string embedding representations of tweets. We identify an approach providing 96.86% accuracy and 97.37% recall using contextualized string embeddings and conclude by discussing the practicality of our proposed methods.
35

Pei, Michael Y., and Stephen R. Clark. "Neural-Network Quantum States for Spin-1 Systems: Spin-Basis and Parameterization Effects on Compactness of Representations." Entropy 23, no. 7 (July 9, 2021): 879. http://dx.doi.org/10.3390/e23070879.

Abstract:
Neural network quantum states (NQS) have been widely applied to spin-1/2 systems, where they have proven to be highly effective. The application to systems with larger on-site dimension, such as spin-1 or bosonic systems, has been explored less and predominantly using spin-1/2 Restricted Boltzmann Machines (RBMs) with a one-hot/unary encoding. Here, we propose a more direct generalization of RBMs for spin-1 that retains the key properties of the standard spin-1/2 RBM, specifically trivial product states representations, labeling freedom for the visible variables and gauge equivalence to the tensor network formulation. To test this new approach, we present variational Monte Carlo (VMC) calculations for the spin-1 anti-ferromagnetic Heisenberg (AFH) model and benchmark it against the one-hot/unary encoded RBM demonstrating that it achieves the same accuracy with substantially fewer variational parameters. Furthermore, we investigate how the hidden unit complexity of NQS depends on the local single-spin basis used. Exploiting the tensor network version of our RBM we construct an analytic NQS representation of the Affleck-Kennedy-Lieb-Tasaki (AKLT) state in the xyz spin-1 basis using only M=2N hidden units, compared to M∼O(N^2) required in the Sz basis. Additional VMC calculations provide strong evidence that the AKLT state in fact possesses an exact compact NQS representation in the xyz basis with only M=N hidden units. These insights help to further unravel how to most effectively adapt the NQS framework for more complex quantum systems.
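The one-hot/unary encoding used as the baseline can be sketched as a mapping from each spin-1 value to three binary visible units. The function name and the ordering of the three units are assumptions made for illustration.

```python
SPIN_VALUES = (-1, 0, 1)   # local Sz eigenvalues of a spin-1 site

def encode_spin1(config):
    """Map each spin-1 value to three binary units, exactly one of which is 1."""
    units = []
    for s in config:
        units.extend(1 if s == v else 0 for v in SPIN_VALUES)
    return units

# A 3-site configuration becomes 9 binary visible units.
print(encode_spin1([1, 0, -1]))  # [0, 0, 1, 0, 1, 0, 1, 0, 0]
```

The encoding triples the number of visible units for N sites, which hints at why a more direct spin-1 generalization can get by with fewer variational parameters.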
36

Chen, Wei, Lei Chen, and Qi Dai. "iMPT-FDNPL: Identification of Membrane Protein Types with Functional Domains and a Natural Language Processing Approach." Computational and Mathematical Methods in Medicine 2021 (October 11, 2021): 1–10. http://dx.doi.org/10.1155/2021/7681497.

Abstract:
Membrane proteins are an important kind of protein. They play essential roles in several cellular processes. Based on their intramolecular arrangements and positions in a cell, membrane proteins can be divided into several types. It is reported that the type of a membrane protein is highly related to its functions. Determination of membrane protein types has been a hot topic in recent years. Plenty of computational methods have been proposed so far. Some of them used functional domain information to encode proteins. However, this procedure was still crude. In this study, we designed a novel feature extraction scheme to obtain informative features of proteins from their functional domain information. This scheme treated domains as words and proteins, each represented by its domains, as sentences. The natural language processing approach, word2vector, was applied to access the features of domains, which were further refined to protein features. Based on these features, RAndom k-labELsets with random forest as the base classifier was employed to build the multilabel classifier, namely, iMPT-FDNPL. The tenfold cross-validation results indicated the good performance of this classifier. Furthermore, this classifier was superior to other classifiers based on features derived from functional domains via a one-hot scheme or derived from other properties of proteins, suggesting the effectiveness of the protein features generated by the proposed scheme.
37

Malov, Igor, Sergey Malov, Liliya Stepanenko, Парамонов, et al. "Reconstruction of recombination sites in genomic structures of the strains of genotype 6 of hepatitis C virus." Бюллетень Восточно-Сибирского научного центра Сибирского отделения Российской академии медицинских наук 1, no. 5 (December 6, 2016): 100–103. http://dx.doi.org/10.12737/23401.

Abstract:
In the coding portion of the complete genomes of 46 strains of genotype 6 of hepatitis C virus, a group of 6 recombinant strains containing 7 recombination sites was identified using the RDP bioinformatics program complex. The recombinant strains correspond to three HCV subtypes: 6a, 6b and 6I. For each identified recombinant, we determined the parent strains from which it could have been obtained. Three recombinants were obtained from parent strains of the same subtype (homologous intra-subgenotypic recombination). For the remaining three recombinants, the parent strains were members of three different subtypes (inter-subgenotypic recombination). In one strain we identified a unique recombination site in the highly conserved NS3 gene. Most of the recombination sites occurred in the region of the structural genes C, E1 and E2, and in the area of the non-structural genes NS5a and NS5b. In the recombinant strain DQ480518-6a, two recombination sites were identified: one is located in the structural and non-structural genes (E2 + NS1 + NS2), and the second in the non-structural region. The size of the recombination sites varies from 86 to 1072 nucleotide bases. The study identified “hot spots” of recombination in the strains of genotype 6 of hepatitis C virus. The recombinants were found in the populations of three countries: the United States (from the serum of an immigrant), Hong Kong and China.
38

Zimbeck, Alicia J., Naureen Iqbal, Angela M. Ahlquist, Monica M. Farley, Lee H. Harrison, Tom Chiller, and Shawn R. Lockhart. "FKS Mutations and Elevated Echinocandin MIC Values among Candida glabrata Isolates from U.S. Population-Based Surveillance." Antimicrobial Agents and Chemotherapy 54, no. 12 (September 13, 2010): 5042–47. http://dx.doi.org/10.1128/aac.00836-10.

Abstract:
Candida glabrata is the second leading cause of candidemia in the United States. Its high-level resistance to triazole antifungal drugs has led to the increased use of the echinocandin class of antifungal agents for primary therapy of these infections. We monitored C. glabrata bloodstream isolates from a population-based surveillance study for elevated echinocandin MIC values (MICs of ≥0.25 μg/ml). From the 490 C. glabrata isolates that were screened, we identified 16 isolates with an elevated MIC value (2.9% of isolates from Atlanta and 2.0% of isolates from Baltimore) for one or more of the echinocandin drugs caspofungin, anidulafungin, and micafungin. All of the isolates with elevated MIC values had a mutation in the previously identified hot spot 1 of either the glucan synthase FKS1 (n = 2) or FKS2 (n = 14) gene. No mutations were detected in hot spot 2 of either FKS1 or FKS2. The predominant mutation was mutation of FKS2-encoded serine 663 to proline (S663P), found in 10 of the isolates with elevated echinocandin MICs. Two of the mutations, R631G for FKS1 and R665G for FKS2, have not been reported previously for C. glabrata. Multilocus sequence typing indicated that the predominance of the S663P mutation was not due to the clonal spread of a single sequence type. With a rising number of echinocandin therapy failures reported, it is important to continue to monitor rates of elevated echinocandin MIC values and the associated mutations.
39

Carpenter, Kristy, Alexander Pilozzi, and Xudong Huang. "A Pilot Study of Multi-Input Recurrent Neural Networks for Drug-Kinase Binding Prediction." Molecules 25, no. 15 (July 24, 2020): 3372. http://dx.doi.org/10.3390/molecules25153372.

Abstract:
The use of virtual drug screening can be beneficial to research teams, enabling them to narrow down potentially useful compounds for further study. A variety of virtual screening methods have been developed, typically with machine learning classifiers at the center of their design. In the present study, we created a virtual screener for protein kinase inhibitors. Experimental compound–target interaction data were obtained from the IDG-DREAM Drug-Kinase Binding Prediction Challenge. These data were converted and fed as inputs into two multi-input recurrent neural networks (RNNs). The first network utilized data encoded in one-hot representation, while the other incorporated embedding layers. The models were developed in Python, and were designed to output the IC50 of the target compounds. The performance of the models was assessed primarily through analysis of the Q2 values produced from runs of differing sample and epoch size; recorded loss values were also reported and graphed. The performance of the models was limited, though multiple changes are proposed for potential improvement of a multi-input recurrent neural network-based screening tool.
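As a rough illustration of the two input representations contrasted in this abstract (not the authors' code), the sketch below shows that multiplying a one-hot vector by a weight matrix returns exactly one row of that matrix, which is the lookup an embedding layer performs. The names `one_hot`, `matvec` and `EMBEDDING`, and the toy matrix values, are assumptions made for illustration only.

```python
def one_hot(index, size):
    """Return a vector of length `size` with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def matvec(vector, matrix):
    """Multiply a row vector by a matrix stored as a list of rows."""
    n_cols = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in range(n_cols)]


# Hypothetical 3-token vocabulary embedded in 2 dimensions.
EMBEDDING = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

token_id = 1
via_one_hot = matvec(one_hot(token_id, len(EMBEDDING)), EMBEDDING)
via_lookup = EMBEDDING[token_id]
print(via_one_hot == via_lookup)
```

The one-hot route needs a vector as long as the vocabulary for every token, whereas the lookup needs only an integer index; the difference in the learned representation comes from training the matrix, which this sketch does not cover.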
40

Cheng, Xin, Jun Wang, Qianyue Li, and Taigang Liu. "BiLSTM-5mC: A Bidirectional Long Short-Term Memory-Based Approach for Predicting 5-Methylcytosine Sites in Genome-Wide DNA Promoters." Molecules 26, no. 24 (December 7, 2021): 7414. http://dx.doi.org/10.3390/molecules26247414.

Abstract:
An important cause of cancer proliferation is the change in DNA methylation patterns, characterized by the localized hypermethylation of the promoters of tumor-suppressor genes together with an overall decrease in the level of 5-methylcytosine (5mC). Therefore, identifying the 5mC sites in the promoters is a critical step towards further understanding the diverse functions of DNA methylation in genetic diseases such as cancers and aging. However, most wet-lab experimental techniques for detecting 5mC sites are time consuming and laborious. In this study, we proposed a deep learning-based approach, called BiLSTM-5mC, for accurately identifying 5mC sites in genome-wide DNA promoters. First, we randomly divided the negative samples into 11 subsets of equal size, each of which can form a balanced subset when combined with the same number of positive samples. Then, two types of feature vectors, encoded by the one-hot method and by the nucleotide property and frequency (NPF) method, were fed into a bidirectional long short-term memory (BiLSTM) network and a fully connected layer to train the 22 submodels. Finally, the outputs of these models were integrated to predict 5mC sites by using the majority vote strategy. Our experimental results demonstrated that BiLSTM-5mC outperformed existing methods based on the same independent dataset.
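A minimal sketch of two steps named in this abstract, one-hot encoding of a DNA sequence and majority voting over binary submodel outputs, may help readers unfamiliar with them. It is not the BiLSTM-5mC implementation: the function names, the A/C/G/T column order, the all-zero row for unknown symbols, and the tie-breaking rule are all assumptions.

```python
NUCLEOTIDES = "ACGT"  # assumed column order


def one_hot_dna(seq):
    """Encode each nucleotide as a 4-element indicator row (A, C, G, T)."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(NUCLEOTIDES)
        if base in index:  # assumption: unknown symbols such as N stay all-zero
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def majority_vote(predictions):
    """Return 1 if strictly more than half of the binary votes are 1, else 0.

    Assumption: a tie (possible with an even number of submodels) yields 0.
    """
    return int(2 * sum(predictions) > len(predictions))


print(one_hot_dna("ACGT"))
print(majority_vote([1, 1, 0]))
```

With an even number of submodels, such as the 22 mentioned in the abstract, ties are possible, so a real system needs an explicit tie rule; the abstract does not say which one the authors used.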
41

Garroussi, Zineb, Rachid Ellaia, El-Ghazali Talbi, and Jean-Yves Lucas. "A hybrid non-dominated sorting genetic algorithm for a multi-objective demand-side management problem in a smart building." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 1 (February 1, 2020): 559. http://dx.doi.org/10.11591/ijece.v10i1.pp559-574.

Abstract:
One of the most significant challenges facing optimization models for the demand-side management (DSM) is obtaining feasible solutions in a shorter time. In this paper, the DSM is formulated in a smart building as a linear constrained multi-objective optimization model to schedule both electrical and thermal loads over one day. Two objectives are considered, energy cost and discomfort caused by allowing flexibility of loads within an acceptable comfort range. To solve this problem, an integrative matheuristic is proposed by combining a multi-objective evolutionary algorithm as a master level with an exact solver as a slave level. To cope with the non-triviality of feasible solutions representation and NP-hardness of our optimization model, in this approach discrete decision variables are encoded as partial chromosomes and the continuous decision variables are determined optimally by an exact solver. This matheuristic is relevant for dealing with the constraints of our optimization model. To validate the performance of our approach, a number of simulations are performed and compared with the goal programming under various scenarios of cold and hot weather conditions. It turns out that our approach outperforms the goal programming with respect to some comparison metrics including the hypervolume difference, epsilon indicator, number of the Pareto solutions found, and computational time metrics.
42

Da’u, Aminu, and Naomie Salim. "Aspect extraction on user textual reviews using multi-channel convolutional neural network." PeerJ Computer Science 5 (May 6, 2019): e191. http://dx.doi.org/10.7717/peerj-cs.191.

Abstract:
Aspect extraction is a subtask of sentiment analysis that deals with identifying opinion targets in an opinionated text. Existing approaches to aspect extraction typically rely on using handcrafted features, linear and integrated network architectures. Although these methods can achieve good performances, they are time-consuming and often very complicated. In real-life systems, a simple model with competitive results is generally more effective and preferable over complicated models. In this paper, we present a multichannel convolutional neural network for aspect extraction. The model consists of a deep convolutional neural network with two input channels: a word embedding channel which aims to encode semantic information of the words and a part of speech (POS) tag embedding channel to facilitate the sequential tagging process. To get the vector representation of words, we initialized the word embedding channel and the POS channel using pretrained word2vec and one-hot-vector of POS tags, respectively. Both the word embedding and the POS embedding vectors were fed into the convolutional layer and concatenated to a one-dimensional vector, which is finally pooled and processed using a Softmax function for sequence labeling. We finally conducted a series of experiments using four different datasets. The results indicated better performance compared to the baseline models.
43

Zhong, Li, Qiuxiang Cheng, Xinli Tian, Liqian Zhao, and Zhongjun Qin. "Characterization of the Replication, Transfer, and Plasmid/Lytic Phage Cycle of the Streptomyces Plasmid-Phage pZL12." Journal of Bacteriology 192, no. 14 (May 14, 2010): 3747–54. http://dx.doi.org/10.1128/jb.00123-10.

Abstract:
We report here the isolation and recombinational cloning of a large plasmid, pZL12, from endophytic Streptomyces sp. 9R-2. pZL12 comprises 90,435 bp, encoding 112 genes, 30 of which are organized in a large operon resembling bacteriophage genes. A replication locus (repA) and a conjugal transfer locus (traA-traC) were identified in pZL12. Surprisingly, the supernatant of a 9R-2 liquid culture containing partially purified phage particles infected 9R-2 cured of pZL12 (9R-2X) to form plaques, and a phage particle (φZL12) was observed by transmission electron microscopy. Major structural proteins (capsid, portal, and tail) of φZL12 virions were encoded by pZL12 genes. Like bacteriophage P1, linear φZL12 DNA contained ends from a largely random pZL12 sequence. There was also a hot end sequence in linear φZL12. φZL12 virions efficiently infected only one host, 9R-2X, but failed to infect and form plaques in 18 other Streptomyces strains. Some 9R-2X spores rescued from lysis by infection of φZL12 virions contained a circular pZL12 plasmid, completing a cycle comprising autonomous plasmid pZL12 and lytic phage φZL12. These results confirm pZL12 as the first example of a plasmid-phage in Streptomyces.
44

Fernandes, Bruno, Alfonso González-Briones, Paulo Novais, Miguel Calafate, Cesar Analide, and José Neves. "An Adjective Selection Personality Assessment Method Using Gradient Boosting Machine Learning." Processes 8, no. 5 (May 21, 2020): 618. http://dx.doi.org/10.3390/pr8050618.

Abstract:
Goldberg’s 100 Unipolar Markers remains one of the most popular ways to measure personality traits, in particular, the Big Five. An important reduction was later performed by Saucier, using a sub-set of 40 markers. Both assessments are performed by presenting a set of markers, or adjectives, to the subject, requesting him to quantify each marker using a 9-point rating scale. Consequently, the goal of this study is to conduct experiments and propose a shorter alternative where the subject is only required to identify which adjectives describe him the most. Hence, a web platform was developed for data collection, requesting subjects to rate each adjective and select those describing him the most. Based on a Gradient Boosting approach, two distinct Machine Learning architectures were conceived, tuned and evaluated. The first makes use of regressors to provide an exact score of the Big Five while the second uses classifiers to provide a binned output. As input, both receive the one-hot encoded selection of adjectives. Both architectures performed well. The first is able to quantify the Big Five with an approximate error of 5 units of measure, while the second shows a micro-averaged f1-score of 83%. Since all adjectives are used to compute all traits, models are able to harness inter-trait relationships, making it possible to further reduce the set of adjectives by removing those that have smaller importance.
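One plausible reading of "one-hot encoded selection of adjectives" is a fixed-length binary vector with a 1 at every selected adjective. The sketch below follows that reading; it is an assumption, not the authors' preprocessing, and the vocabulary shown is a hypothetical four-word stand-in for the real marker set.

```python
def encode_selection(selected, vocabulary):
    """Return a 0/1 vector over `vocabulary`, with 1 for each selected adjective."""
    chosen = set(selected)
    unknown = chosen - set(vocabulary)
    if unknown:
        raise ValueError(f"adjectives not in vocabulary: {sorted(unknown)}")
    return [1 if adjective in chosen else 0 for adjective in vocabulary]


# Hypothetical vocabulary; the study uses a much larger set of markers.
VOCABULARY = ["bold", "kind", "moody", "organized"]

print(encode_selection(["kind", "organized"], VOCABULARY))
```

Because the vector has the same length and column order for every subject, it can be passed directly to a regressor or classifier as a feature row.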
45

Miao, Yan, Fu Liu, Tao Hou, and Yun Liu. "Virtifier: a deep learning-based identifier for viral sequences from metagenomes." Bioinformatics 38, no. 5 (December 15, 2021): 1216–22. http://dx.doi.org/10.1093/bioinformatics/btab845.

Abstract:
Motivation: Viruses, the most abundant biological entities on earth, are important components of microbial communities, and as major human pathogens, they are responsible for human mortality and morbidity. The identification of viral sequences from metagenomes is critical for viral analysis. As massive quantities of short sequences are generated by next-generation sequencing, most methods utilize discrete and sparse one-hot vectors to encode nucleotide sequences, which are usually ineffective in viral identification. Results: In this article, Virtifier, a deep learning-based viral identifier for sequences from metagenomic data is proposed. It includes a meaningful nucleotide sequence encoding method named Seq2Vec and a variant viral sequence predictor with an attention-based long short-term memory (LSTM) network. By utilizing a fully trained embedding matrix to encode codons, Seq2Vec can efficiently extract the relationships among those codons in a nucleotide sequence. Combined with an attention layer, the LSTM neural network can further analyze the codon relationships and sift the parts that contribute to the final features. Experimental results of three datasets have shown that Virtifier can accurately identify short viral sequences (<500 bp) from metagenomes, surpassing three widely used methods, VirFinder, DeepVirFinder and PPR-Meta. Meanwhile, a comparable performance was achieved by Virtifier at longer lengths (>5000 bp). Availability and implementation: A Python implementation of Virtifier and the Python code developed for this study have been provided on Github https://github.com/crazyinter/Seq2Vec. The RefSeq genomes in this article are available in VirFinder at https://dx.doi.org/10.1186/s40168-017-0283-5. The CAMI Challenge Dataset 3 CAMI_high dataset in this article is available in CAMI at https://data.cami-challenge.org/participate. The real human gut metagenomes in this article are available at https://dx.doi.org/10.1101/gr.142315.112.
Supplementary information: Supplementary data are available at Bioinformatics online.
46

Huang, Rui, Han Zhou, Tong Liu, and Hanlin Sheng. "Multi-UAV Collaboration to Survey Tibetan Antelopes in Hoh Xil." Drones 6, no. 8 (August 6, 2022): 196. http://dx.doi.org/10.3390/drones6080196.

Abstract:
Reducing the total mission time is essential in wildlife surveys owing to the dynamic movement of animals throughout their migrating environment and potentially extreme changes in weather. This paper proposed a multi-UAV path planning method for counting various flora and fauna populations, which can fully use the UAVs’ limited flight time to cover large areas. Unlike the current complete coverage path planning methods, based on sweep and polygon, our work encoded the path planning problem as a satisfiability modulo theories (SMT) problem using a one-hot encoding scheme. Each instance generated a set of feasible paths at each iteration and recovered the set of shortest paths after sufficient time. We also flexibly optimized the paths based on the number of UAVs, endurance and camera parameters. We implemented the planning algorithm with four UAVs to conduct multiple photographic aerial wildlife surveys in areas around Zonag Lake, the birthplace of Tibetan antelope. Over 6 square kilometers was surveyed in about 2 h. In contrast, previous human-piloted single-drone surveys of the same area required over 4 days to complete. A generic few-shot detector that can perform effective counting without training on the target object is utilized in this paper, which can achieve an accuracy of over 97%.
47

Zheng, Hai-Tao, Jin-Yuan Chen, Nan Liang, Arun Sangaiah, Yong Jiang, and Cong-Zhi Zhao. "A Deep Temporal Neural Music Recommendation Model Utilizing Music and User Metadata." Applied Sciences 9, no. 4 (February 18, 2019): 703. http://dx.doi.org/10.3390/app9040703.

Abstract:
Deep learning shows its superiority in many domains such as computer vision, natural language processing, and speech recognition. In music recommendation, most deep learning-based methods focus on learning users’ temporal preferences using their listening histories. The cold start problem is not addressed, however, and the music characteristics are not fully exploited by these methods. In addition, the music characteristics and the users’ temporal preferences are not combined naturally, which causes the relatively low performance of music recommendation. To address these issues, we proposed a Deep Temporal Neural Music Recommendation model (DTNMR) based on music characteristics and the users’ temporal preferences. We encoded the music metadata into one-hot vectors and utilized the Deep Neural Network to project the music vectors to low-dimensional space and obtain the music characteristics. In addition, Long Short-Term Memory (LSTM) neural networks are utilized to learn about users’ long-term and short-term preferences from their listening histories. DTNMR alleviates the cold start problem in the item side using the music metadata and discovers new users’ preferences immediately after they listen to music. The experimental results show DTNMR outperforms seven baseline methods in terms of recall, precision, f-measure, MAP, user coverage and AUC.
48

Rodríguez-Blanco, Arturo, Manuel L. Lemos, and Carlos R. Osorio. "Integrating Conjugative Elements as Vectors of Antibiotic, Mercury, and Quaternary Ammonium Compound Resistance in Marine Aquaculture Environments." Antimicrobial Agents and Chemotherapy 56, no. 5 (February 6, 2012): 2619–26. http://dx.doi.org/10.1128/aac.05997-11.

Abstract:
The presence of SXT/R391-related integrating conjugative elements (ICEs) in bacterial strains isolated from fish obtained from marine aquaculture environments in 2001 to 2010 in the northwestern Iberian Peninsula was studied. ICEs were detected in 12 strains taxonomically related to Vibrio scophthalmi (3 strains), Vibrio splendidus (5 strains), Vibrio alginolyticus (1 strain), Shewanella haliotis (1 strain), and Enterovibrio nigricans (2 strains), broadening the known host range able to harbor SXT/R391-like ICEs. Variable DNA regions, which confer element-specific properties to ICEs of this family, were characterized. One of the ICEs encoded antibiotic resistance functions in variable region III, consisting of a tetracycline resistance locus. Interestingly, hot spot 4 included genes providing resistance to rifampin (ICEVspPor2 and ICEValPor1) and quaternary ammonium compounds (QACs) (ICEEniSpa1), and variable region IV included a mercury resistance operon (ICEVspSpa1 and ICEEniSpa1). The S exclusion group was more represented than the R exclusion group, accounting for two-thirds of the total ICEs. Mating experiments allowed ICE mobilization to Escherichia coli strains, showing the corresponding transconjugants' rifampin, mercury, and QAC resistance. These results show the first evidence of ICEs providing rifampin and QAC resistances, suggesting that these mobile genetic elements contribute to the dissemination of antimicrobial, heavy metal, and QAC resistance determinants in aquaculture environments.
49

Cayô, Rodrigo, María-Cruz Rodríguez, Paula Espinal, Felipe Fernández-Cuenca, Alain A. Ocampo-Sosa, Álvaro Pascual, Juan A. Ayala, Jordi Vila, and Luis Martínez-Martínez. "Analysis of Genes Encoding Penicillin-Binding Proteins in Clinical Isolates of Acinetobacter baumannii." Antimicrobial Agents and Chemotherapy 55, no. 12 (September 26, 2011): 5907–13. http://dx.doi.org/10.1128/aac.00459-11.

Abstract:
There is limited information on the role of penicillin-binding proteins (PBPs) in the resistance of Acinetobacter baumannii to β-lactams. This study presents an analysis of the allelic variations of PBP genes in A. baumannii isolates. Twenty-six A. baumannii clinical isolates (susceptible or resistant to carbapenems) from three teaching hospitals in Spain were included. The antimicrobial susceptibility profile, clonal pattern, and genomic species identification were also evaluated. Based on the six complete genomes of A. baumannii, the PBP genes were identified, and primers were designed for each gene. The nucleotide sequences of the genes identified that encode PBPs and the corresponding amino acid sequences were compared with those of ATCC 17978. Seven PBP genes and one monofunctional transglycosylase (MGT) gene were identified in the six genomes, encoding (i) four high-molecular-mass proteins (two of class A, PBP1a [ponA] and PBP1b [mrcB], and two of class B, PBP2 [pbpA or mrdA] and PBP3 [ftsI]), (ii) three low-molecular-mass proteins (two of type 5, PBP5/6 [dacC] and PBP6b [dacD], and one of type 7, PBP7/8 [pbpG]), and (iii) a monofunctional enzyme (MtgA [mtgA]). Hot spot mutation regions were observed, although most of the allelic changes found translated into silent mutations. The amino acid consensus sequences corresponding to the PBP genes in the genomes and the clinical isolates were highly conserved. The changes found in amino acid sequences were associated with concrete clonal patterns but were not directly related to susceptibility or resistance to β-lactams. An insertion sequence disrupting the gene encoding PBP6b was identified in an endemic carbapenem-resistant clone in one of the participant hospitals.
50

Zhou, Yiting, Tingfang Wu, Yelu Jiang, Yan Li, Kailong Li, Lijun Quan, and Qiang Lyu. "DeepNup: Prediction of Nucleosome Positioning from DNA Sequences Using Deep Neural Network." Genes 13, no. 11 (October 30, 2022): 1983. http://dx.doi.org/10.3390/genes13111983.

Abstract:
Nucleosome positioning plays a vital role in diverse cellular biological processes by regulating the accessibility of DNA sequences to DNA-binding proteins. Previous studies have shown that the intrinsic preference of nucleosomes for DNA sequences may play a dominant role in nucleosome positioning. As a consequence, it is nontrivial to develop computational methods only based on DNA sequence information to accurately identify nucleosome positioning, and thus intend to verify the contribution of DNA sequences responsible for nucleosome positioning. In this work, we propose a new deep learning-based method, named DeepNup, which enables us to improve the prediction of nucleosome positioning only from DNA sequences. Specifically, we first use a hybrid feature encoding scheme that combines One-hot encoding and Trinucleotide composition encoding to encode raw DNA sequences; afterwards, we employ multiscale convolutional neural network modules that consist of two parallel convolution kernels with different sizes and gated recurrent units to effectively learn the local and global correlation feature representations; lastly, we use a fully connected layer and a sigmoid unit serving as a classifier to integrate these learned high-order feature representations and generate the final prediction outcomes. By comparing the experimental evaluation metrics on two benchmark nucleosome positioning datasets, DeepNup achieves a better performance for nucleosome positioning prediction than that of several state-of-the-art methods. These results demonstrate that DeepNup is a powerful deep learning-based tool that enables one to accurately identify potential nucleosome sequences.
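For readers unfamiliar with trinucleotide composition, the second half of the hybrid encoding named in this abstract, the sketch below computes it in its common form: the relative frequency of each of the 64 possible 3-mers over overlapping windows. This is an assumption about the general technique, not DeepNup's code; the abstract does not state how the authors normalise the counts or how they combine this vector with the one-hot matrix.

```python
from itertools import product

BASES = "ACGT"
# All 64 trinucleotides in lexicographic order: AAA, AAC, ..., TTT.
TRINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=3)]


def trinucleotide_composition(seq):
    """Return the relative frequency of each 3-mer over overlapping windows."""
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # assumption: windows with non-ACGT symbols are skipped
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[k] / total for k in TRINUCLEOTIDES]


composition = trinucleotide_composition("ACGT")
print(len(composition), composition[TRINUCLEOTIDES.index("ACG")])
```

Unlike a one-hot matrix, whose size grows with sequence length, this vector always has 64 entries, which is one reason the two encodings are often used as complementary inputs.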