MTA-SZTE Research Group on Artificial Intelligence
Facility: Szeged, Hungary
Research output, citation impact, and the most-cited recent papers from MTA-SZTE Research Group on Artificial Intelligence (Hungary). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from MTA-SZTE Research Group on Artificial Intelligence
We present a full result for the 2+1 flavor QCD equation of state. All the systematics are controlled, the quark masses are set to their physical values, and the continuum extrapolation is carried out. This extends our previous studies (Aoki et al., 2006 [18]; Borsanyi et al., 2010 [14]) to even finer lattices and now includes ensembles with Nt=6, 8, 10, 12 up to Nt=16. We use a Symanzik improved gauge and a stout-link improved staggered fermion action. Our findings confirm our earlier results. In order to facilitate the direct use of our equation of state we make our tabulated results available for download [33].
BACKGROUND: Detecting uncertain and negative assertions is essential in most BioMedical Text Mining tasks where, in general, the aim is to derive factual knowledge from textual data. This article reports on a corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts (we call this corpus the BioScope corpus). RESULTS: The corpus consists of three parts, namely medical free texts, biological full papers and biological scientific abstracts. The dataset contains annotations at the token level for negative and speculative keywords and at the sentence level for their linguistic scope. The annotation process was carried out by two independent linguist annotators and a chief linguist (also responsible for setting up the annotation guidelines) who resolved cases where the annotators disagreed. The resulting corpus consists of more than 20,000 sentences that were considered for annotation, and over 10% of them actually contain one (or more) linguistic annotation suggesting negation or uncertainty. CONCLUSION: Statistics are reported on corpus size, ambiguity levels and the consistency of annotations. The corpus is accessible for academic purposes and is free of charge. Apart from the intended goal of serving as a common resource for the training, testing and comparing of biomedical Natural Language Processing systems, the corpus is also a good resource for the linguistic analysis of scientific and clinical texts.
Mushroom-forming fungi (Agaricomycetes) have the greatest morphological diversity and complexity of any group of fungi. They have radiated into most niches and fulfil diverse roles in the ecosystem, including wood decomposers, pathogens or mycorrhizal mutualists. Despite the importance of mushroom-forming fungi, large-scale patterns of their evolutionary history are poorly known, in part due to the lack of a comprehensive and dated molecular phylogeny. Here, using multigene and genome-based data, we assemble a 5,284-species phylogenetic tree and infer ages and broad patterns of speciation/extinction and morphological innovation in mushroom-forming fungi. Agaricomycetes started a rapid class-wide radiation in the Jurassic, coinciding with the spread of (sub)tropical coniferous forests and a warming climate. A possible mass extinction, several clade-specific adaptive radiations and morphological diversification of fruiting bodies followed during the Cretaceous and the Paleogene, convergently giving rise to the classic toadstool morphology, with a cap, stalk and gills (pileate-stipitate morphology). This morphology is associated with increased rates of lineage diversification, suggesting it represents a key innovation in the evolution of mushroom-forming fungi. The increase in mushroom diversity started during the Mesozoic-Cenozoic radiation event, an era of humid climate when terrestrial communities dominated by gymnosperms and reptiles were also expanding.
BACKGROUND: Even today the reliable diagnosis of the prodromal stages of Alzheimer's disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive decline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method which is based on the analysis of spontaneous speech production during the performance of a memory task. In the future, this can form the basis of Internet-based interactive screening software for the recognition of MCI. METHODS: Participants were 38 healthy controls and 48 clinically diagnosed MCI patients. Spontaneous speech was provoked by asking the patients to recall the content of 2 short black and white films (one direct, one delayed), and to answer one question. Acoustic parameters (hesitation ratio, speech tempo, length and number of silent and filled pauses, length of utterance) were extracted from the recorded speech signals, first manually (using the Praat software), and then automatically, with an automatic speech recognition (ASR) based tool. First, the extracted parameters were statistically analyzed. Then we applied machine learning algorithms to see whether the MCI and the control group can be discriminated automatically based on the acoustic features. RESULTS: The statistical analysis showed significant differences for most of the acoustic parameters (speech tempo, articulation rate, silent pause, hesitation ratio, length of utterance, pause-per-utterance ratio). The most significant differences between the two groups were found in the speech tempo in the delayed recall task, and in the number of pauses for the question-answering task. The fully automated version of the analysis process - that is, using the ASR-based features in combination with machine learning - was able to separate the two classes with an F1-score of 78.8%.
CONCLUSION: The temporal analysis of spontaneous speech can be exploited in implementing a new, automatic detection-based tool for screening MCI for the community.
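The temporal parameters named in this abstract can be illustrated with a minimal Python sketch. It assumes the recording has already been segmented into speech, silent pauses and filled pauses; the `Segment` type, the `temporal_features` function and the exact feature definitions (hesitation ratio as pause time over total time, speech tempo as phonemes per second of total time, articulation rate as phonemes per second of speech time) are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One labelled stretch of a recording (hypothetical representation)."""
    kind: str          # "speech", "silent_pause", or "filled_pause"
    duration: float    # length in seconds
    phonemes: int = 0  # phoneme count, only meaningful for speech segments


def temporal_features(segments):
    """Compute simple temporal speech features from labelled segments.

    The definitions below are assumptions made for illustration.
    """
    total = sum(s.duration for s in segments)
    pauses = [s for s in segments if s.kind != "speech"]
    pause_time = sum(s.duration for s in pauses)
    speech_time = total - pause_time
    phonemes = sum(s.phonemes for s in segments if s.kind == "speech")
    return {
        # share of the utterance spent in silent or filled pauses
        "hesitation_ratio": pause_time / total if total else 0.0,
        # phonemes per second over the whole utterance, pauses included
        "speech_tempo": phonemes / total if total else 0.0,
        # phonemes per second over the spoken parts only
        "articulation_rate": phonemes / speech_time if speech_time else 0.0,
        "pause_count": len(pauses),
    }


example = [
    Segment("speech", 2.0, 20),
    Segment("silent_pause", 1.0),
    Segment("speech", 1.0, 10),
]
features = temporal_features(example)
```

Feature dictionaries of this kind, one per recording, are the sort of input a standard classifier could use to separate the two groups.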
Understanding how evolution of antimicrobial resistance increases resistance to other drugs is a challenge of profound importance. By combining experimental evolution and genome sequencing of 63 laboratory-evolved lines, we charted a map of cross-resistance interactions between antibiotics in Escherichia coli, and explored the driving evolutionary principles. Here, we show that (1) convergent molecular evolution is prevalent across antibiotic treatments, (2) resistance conferring mutations simultaneously enhance sensitivity to many other drugs and (3) 27% of the accumulated mutations generate proteins with compromised activities, suggesting that antibiotic adaptation can partly be achieved without gain of novel function. By using knowledge on antibiotic properties, we examined the determinants of cross-resistance and identified chemogenomic profile similarity between antibiotics as the strongest predictor. In contrast, cross-resistance between two antibiotics is independent of whether they show synergistic effects in combination. These results have important implications on the development of novel antimicrobial strategies.
PURPOSE: The biomedical applications of silver nanoparticles (AgNPs) are heavily investigated due to their cytotoxic and antimicrobial properties. However, the scientific literature is lacking in data on the aggregation behavior of nanoparticles, especially regarding its impact on biological activity. Therefore, to assess the potential of AgNPs in therapeutic applications, two different AgNP samples were compared under biorelevant conditions. METHODS: Citrate-capped nanosilver was produced by classical chemical reduction and stabilization with sodium citrate (AgNP@C), while green tea extract was used to produce silver nanoparticles in a green synthesis approach (AgNP@GT). Particle size, morphology, and crystallinity were characterized using transmission electron microscopy. To observe the effects of the most important biorelevant conditions on AgNP colloidal stability, aggregation grade measurements were carried out using UV-Vis spectroscopy and dynamic light scattering, while MTT assay and a microdilution method were performed to evaluate the effects of aggregation on cytotoxicity and antimicrobial activity in a time-dependent manner. RESULTS: The aggregation behavior of AgNPs is mostly affected by pH and electrolyte concentration, while the presence of biomolecules can improve particle stability due to the biomolecular corona effect. We demonstrated that high aggregation grade in both AgNP samples attenuated their toxic effect toward living cells. However, AgNP@GT proved less prone to aggregation and thus retained a degree of its toxicity. CONCLUSION: To our knowledge, this is the first systematic examination regarding AgNP aggregation behavior with simultaneous measurements of its effect on biological activity. We showed that nanoparticle behavior in complex systems can be estimated by simple compounds like sodium chloride and glutamine.
Electrostatic stabilization might not be suitable for biomedical AgNP applications, while green synthesis approaches could offer new frontiers to preserve nanoparticle toxicity by enhancing colloidal stability. The importance of properly selected synthesis methods must be emphasized as they profoundly influence colloidal stability, and therefore biological activity.
The eighteenth data release (DR18) of the Sloan Digital Sky Survey (SDSS) is the first one for SDSS-V, the fifth generation of the survey. SDSS-V comprises three primary scientific programs or “Mappers”: the Milky Way Mapper (MWM), the Black Hole Mapper (BHM), and the Local Volume Mapper. This data release contains extensive targeting information for the two multiobject spectroscopy programs (MWM and BHM), including input catalogs and selection functions for their numerous scientific objectives. We describe the production of the targeting databases and their calibration and scientifically focused components. DR18 also includes ∼25,000 new SDSS spectra and supplemental information for X-ray sources identified by eROSITA in its eFEDS field. We present updates to some of the SDSS software pipelines and preview changes anticipated for DR19. We also describe three value-added catalogs (VACs) based on SDSS-IV data that have been published since DR17, and one VAC based on the SDSS-V data in the eFEDS field.
Efficient generation of THz pulses with high energy was demonstrated by optical rectification of 785-fs laser pulses in lithium niobate using tilted-pulse-front pumping. The enhancement of conversion efficiency by a factor of 2.4 to 2.7 was demonstrated up to 186 μJ THz energy by cryogenic cooling of the generating crystal and using up to 18.5 mJ/cm² pump fluence. Generation of THz pulses with more than 0.4 mJ energy and 0.77% efficiency was demonstrated even at room temperature by increasing the pump fluence to 186 mJ/cm². The spectral peak is at about 0.2 THz, suitable for charged-particle manipulation.
It is known that Alzheimer's disease (AD) influences the temporal characteristics of spontaneous speech. These phonetical changes are present even in mild AD. Based on this, the question arises whether an examination based on language analysis could help the early diagnosis of AD and if so, which language and speech characteristics can identify AD in its early stage. The purpose of this article is to summarize the relation between prodromal and manifest AD and language functions and language domains. Based on our research, we are inclined to claim that AD can be more sensitively detected with the help of a linguistic analysis than with other cognitive examinations. The temporal characteristics of spontaneous speech, such as speech tempo, number of pauses in speech, and their length are sensitive detectors of the early stage of the disease, which enables an early simple linguistic screening for AD. However, knowledge about the unique features of the language problems associated with different dementia variants still has to be improved and refined.
BACKGROUND: ) in order to describe the "fall" rate in serum. MATERIALS AND METHODS: Through searches on EMBASE, Medline, and Scopus, we looked for articles where these proteins had been serially sampled in serum in human TBI. We excluded animal studies, studies with only one presented sample and studies without neuroradiological examinations. RESULTS: = 2) only increased in the few studies available, suggesting a serum availability of >10 days. To date, automated assays are available for S100B and NSE making them faster and more practical to use. CONCLUSION: Serial sampling of brain-specific proteins in serum reveals different temporal trajectories that should be acknowledged. Proteins with shorter serum availability, like S100B, may be superior to proteins such as NF-L in detection of secondary harmful events when monitoring patients with TBI.
Alzheimer's disease (AD) and Parkinson's disease (PD) are the most common neurodegenerative diseases (NDs), presenting a broad range of symptoms from motor dysfunctions to psychobehavioral manifestations. A common clinical course is the proteinopathy-induced neural dysfunction leading to anatomically corresponding neuropathies. However, current diagnostic criteria based on pathology and symptomatology are of little value for the sake of disease prevention and drug development. Overviewing the pathomechanism of NDs, this review incorporates systematic reviews on inflammatory cytokines and tryptophan metabolites kynurenines (KYNs) of human samples, to present an inferential method to explore potential links behind NDs. The results revealed increases of pro-inflammatory cytokines and neurotoxic KYNs in NDs, increases of anti-inflammatory cytokines in AD, PD, Huntington's disease (HD), Creutzfeldt-Jakob disease, and human immunodeficiency virus (HIV)-associated neurocognitive disorders, and decreases of neuromodulatory KYNs in AD, PD, and HD. The results reinforced a strong link between inflammation and neurotoxic KYNs, confirmed activation of adaptive immune response, and suggested a possible role in the decrease of neuromodulatory KYNs, all of which may contribute to the development of chronic low grade inflammation. Commonalities of multifactorial NDs were discussed to present a current limit of diagnostic criteria, a need for preclinical biomarkers, and an approach to search the initiation factors of NDs.
The tryptophan (TRP)-kynurenine (KYN) metabolic pathway is a main player of TRP metabolism through which more than 95% of TRP is catabolized. The pathway is activated by acute and chronic immune responses leading to a wide range of illnesses including cancer, immune diseases, neurodegenerative diseases and psychiatric disorders. The presence of positive feedback loops amplifies the immune responses in turn. The TRP-KYN pathway synthesizes multifarious metabolites including oxidants, antioxidants, neurotoxins, neuroprotectants and immunomodulators. The immunomodulators are known to facilitate the immune system towards a tolerogenic state, resulting in chronic low-grade inflammation (LGI) that is commonly present in obesity, poor nutrition, exposure to chemicals or allergens, prodromal stage of various illnesses and chronic diseases. KYN, kynurenic acid, xanthurenic acid and cinnabarinic acid are aryl hydrocarbon receptor ligands that serve as immunomodulators. Furthermore, TRP-KYN pathway enzymes are known to be activated by the stress hormone cortisol and inflammatory cytokines, and genotypic variants were observed to contribute to inflammation and thus various diseases. The tryptophan 2,3-dioxygenase, the indoleamine 2,3-dioxygenases and the kynurenine-3-monooxygenase are main enzymes in the pathway. This review article discusses the TRP-KYN pathway with special emphasis on its interaction with the immune system and the tolerogenic shift towards chronic LGI and overviews the major symptoms, pro- and anti-inflammatory cytokines and toxic and protective KYNs to explore the linkage between chronic LGI, KYNs, and major psychiatric disorders, including depressive disorder, bipolar disorder, substance use disorder, post-traumatic stress disorder, schizophrenia and autism spectrum disorder.
In recent years, the debate has continued among researchers and instructors regarding the influence of Problem-Based Learning (PBL) on the effectiveness of instructional intervention for Critical Thinking (CT) in higher education. This study, conducting a meta-analysis by synthesizing 50 relevant empirical studies from 2000 to 2021 with 5,210 participants and 58 effect sizes, aims to present potential factors (i.e., sample size, sample type, instruction type, gender, maturity, instrument, nationality, discipline, treatment duration, and group size) that may influence the effectiveness of the cultivation of CT skills and disposition on the basis of PBL. No evident publication bias was found (Egger's bias = 1.21, p > 0.05). From the general perspective, the results demonstrate the high level of influence of PBL (Standardized Mean Difference [SMD] = 0.640, p < 0.001) on CT with heterogeneity (I² = 82.9%) due to the adopted instruments, mixed methods, and target outcomes, and no difference was observed between influence on CT skills and disposition. Students' maturity, nationality, sample type, instruction type, and group size are influencing factors of overall CT. The effects of intervention for seniors, western students, randomized samples, online instruction, and groups with fewer than six members are better, whereas short-term intervention is ineffective. For CT skills, the treatments for juniors and groups with fewer than six members are ineffective, and sample type and instruction type are not influencing factors. However, the effect sizes of big sample sizes, seniors, other kinds of instruments, western students, Sciences as a discipline, more than ten members in a group, and long-term intervention are stronger.
For CT disposition, sample type, instruction type, discipline, and intervention duration are influencing factors, in which randomized samples, online instruction, students in Medicine, and medium-term intervention exerted a stronger effect than the other factors. In conclusion, although PBL is overall effective for promoting the acquisition of CT (skills and disposition), additional studies are also required to explore the effectiveness and influencing factors in other contexts, such as various learning or teaching strategies, environments, and scaffoldings, and scenario-problem-based tasks instead of only curriculum-based ones. These factors should also be considered to promote CT skills and disposition among undergraduates.
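The pooled effect and the heterogeneity statistic quoted in this abstract come from standard meta-analytic arithmetic, which can be sketched as follows. This is a minimal illustration assuming inverse-variance (fixed-effect) weighting, Cochran's Q and the usual I² formula; the function name `pooled_effect_and_i2` and the three input values are made up for the example and are not data from the study.

```python
def pooled_effect_and_i2(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    effects   -- per-study standardized mean differences
    variances -- per-study sampling variances of those effects
    """
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
pooled, q, i2 = pooled_effect_and_i2([0.2, 0.6, 1.0], [0.04, 0.04, 0.04])
```

With these inputs the pooled effect is 0.6, Q is 8 on 2 degrees of freedom, and I² is 75%. A random-effects analysis would additionally estimate a between-study variance, which this sketch leaves out.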
Neuropathic pain is a chronic secondary pain condition, which is a consequence of peripheral or central nervous (somatosensory) system lesions or diseases. It is a devastating condition, which affects around 7% of the general population. Numerous etiological factors contribute to the development of chronic neuropathic pain. It can originate from the peripheral part of the nervous system such as in the case of trigeminal or postherpetic neuralgia, peripheral nerve injury, painful polyneuropathies, or radiculopathies. Central chronic neuropathic pain can develop as a result of spinal cord or brain injury, stroke, or multiple sclerosis. As first-line pharmacological treatment options, tricyclic antidepressants, serotonin-norepinephrine reuptake inhibitors, and gabapentinoids are recommended. In trigeminal neuralgia, carbamazepine and oxcarbazepine are the first-choice drugs. In drug-refractory cases, interventional, physical, and psychological therapies are available. This review was structured based on a PubMed search of papers published in the field from 2010 until May 2019.
Federated learning is a distributed machine learning approach for computing models over data collected by edge devices. Most importantly, the data itself is not collected centrally, but a master-worker architecture is applied where a master node performs aggregation and the edge devices are the workers, not unlike the parameter server approach. Gossip learning also assumes that the data remains at the edge devices, but it requires no aggregation server or any central component. In this empirical study, we present a thorough comparison of the two approaches. We examine the aggregated cost of machine learning in both cases, considering also a compression technique applicable in both approaches. We apply a real churn trace as well collected over mobile phones, and we also experiment with different distributions of the training data over the devices. Surprisingly, gossip learning actually outperforms federated learning in all the scenarios where the training data are distributed uniformly over the nodes, and it performs comparably to federated learning overall.
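The architectural difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration under strong assumptions: the model is a single scalar fitted to local data by one gradient step on squared error, and the names `local_step`, `federated_round` and `gossip_round` are hypothetical. It omits compression, churn and every other detail of the study's experimental setup.

```python
import random


def local_step(w, data, lr=0.5):
    """One gradient step on the mean of 0.5 * (w - x)^2 over local data."""
    grad = sum(w - x for x in data) / len(data)
    return w - lr * grad


def federated_round(global_w, node_data):
    """Master-worker: each worker trains a copy, the master averages them."""
    updates = [local_step(global_w, d) for d in node_data]
    return sum(updates) / len(updates)


def gossip_round(models, node_data, rng):
    """No server: each node sends its model to one random peer, which
    merges it with its own model by averaging and then trains locally."""
    n = len(models)
    new = list(models)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # shift so that the chosen peer is never node i itself
        merged = (new[j] + models[i]) / 2
        new[j] = local_step(merged, node_data[j])
    return new


# Four nodes, each holding one local data point (made-up values).
node_data = [[1.0], [2.0], [3.0], [4.0]]

w = 0.0
for _ in range(30):
    w = federated_round(w, node_data)

rng = random.Random(0)
models = [0.0] * len(node_data)
for _ in range(30):
    models = gossip_round(models, node_data, rng)
```

In the federated loop a single model converges to the global mean of 2.5. In the gossip loop every node keeps its own model, and information spreads only through peer-to-peer merges.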
In this review, the potential causes of ageing are discussed. We seek to gain insight into the main physiological functions of mitochondria and discuss alterations in their function and the genome, which are supposed to be the central mechanisms in senescence. We conclude by presenting the potential modulating role of the kynurenine pathway in the ageing processes. Mitochondrial dynamics are supposed to have important physiological roles in maintaining cell homeostasis. During ageing, a decrease in mitochondrial dynamics was reported, potentially compromising the function of mitochondria. Mitochondrial biogenesis not only encompasses mitochondrial dynamics, but also the regulation of transcription and translation of genes, and mitochondria are supposed to play a prominent role in cell death during senescence. Defects in the mtDNA replication machinery and failure in the repair of mtDNA might result in the accumulation of mutations, leading to mitochondrial dysfunction and bioenergetic failure of the cell. The role of reactive oxygen species (ROS) in the ageing processes is widely acknowledged. Exaggerated oxidative damage to mtDNA is supposed to take place during senescence, including single-nucleotide base alterations, nucleotide base pair alterations, chain breaks and cross linkage. A broad repertoire for the repair of DNA faults has evolved, but they do not function efficiently during senescence. Poly (ADP-ribose) polymerase (PARP) is an enzyme that assists in DNA repair, i.e., it participates in the repair of single-stranded DNA nicks, initiating base excision repair (BER). In the case of extensive DNA damage, PARP-1 becomes overactivated and rapidly depletes the intracellular NAD+ and ATP pools. This results in a profound energy loss of the cell and leads to cell dysfunction, or even cell death. Alterations in the kynurenine system have been linked with ageing processes and several age-related disorders.
The kynurenine pathway degrades tryptophan (TRP) to several metabolites, among others kynurenine (KYN), kynurenic acid (KYNA) and quinolinic acid (QUIN). The end product of the route is NAD+. The first metabolic reaction is mediated by TRP-2,3-dioxygenase (TDO) or indolamine-2,3-dioxygenases (IDO), the latter being induced by inflammation, and it is thought to have a significant role in several disorders and in ageing. Research is currently focusing on the KYN pathway, since several intermediates possess neuro- and immunoactive properties, and hence are capable of modulating the activity of certain brain cells and inflammatory responses. During ageing, and in many age-associated disorders like obesity, dyslipidaemia, hypertension, insulin resistance and neurodegenerative diseases, low-grade, sustained inflammation and upregulation of IDO have been reported. However, TRP downstream catabolites create a negative feedback loop by weakening the activated immune system through several actions, including a decline in the Th1 response and an enhancement of Th2-type processes. The broad actions of the KYN-intermediates in brain excitation/inhibition and their role in regulating immune responses may provide the possibility of modifying the pathological processes in an array of age-associated diseases in the future.
BACKGROUND & AIMS: Excessive consumption of ethanol is one of the most common causes of acute and chronic pancreatitis. Alterations to the gene encoding the cystic fibrosis transmembrane conductance regulator (CFTR) also cause pancreatitis. However, little is known about the role of CFTR in the pathogenesis of alcohol-induced pancreatitis. METHODS: We measured CFTR activity based on chloride concentrations in sweat from patients with cystic fibrosis, patients admitted to the emergency department because of excessive alcohol consumption, and healthy volunteers. We measured CFTR levels and localization in pancreatic tissues and in patients with acute or chronic pancreatitis induced by alcohol. We studied the effects of ethanol, fatty acids, and fatty acid ethyl esters on secretion of pancreatic fluid and HCO3−, levels and function of CFTR, and exchange of Cl− for HCO3− in pancreatic cell lines as well as in tissues from guinea pigs and CFTR knockout mice after administration of alcohol. RESULTS: Chloride concentrations increased in sweat samples from patients who acutely abused alcohol but not in samples from healthy volunteers, indicating that alcohol affects CFTR function. Pancreatic tissues from patients with acute or chronic pancreatitis had lower levels of CFTR than tissues from healthy volunteers. Alcohol and fatty acids inhibited secretion of fluid and HCO3−, as well as CFTR activity, in pancreatic ductal epithelial cells. These effects were mediated by sustained increases in concentrations of intracellular calcium and adenosine 3′,5′-cyclic monophosphate, depletion of adenosine triphosphate, and depolarization of mitochondrial membranes. In pancreatic cell lines and pancreatic tissues of mice and guinea pigs, administration of ethanol reduced expression of CFTR messenger RNA, reduced the stability of CFTR at the cell surface, and disrupted folding of CFTR at the endoplasmic reticulum. CFTR knockout mice given ethanol or fatty acids developed more severe pancreatitis than mice not given ethanol or fatty acids. CONCLUSIONS: Based on studies of human, mouse, and guinea pig pancreata, alcohol disrupts expression and localization of the CFTR. This appears to contribute to development of pancreatitis. Strategies to increase CFTR levels or function might be used to treat alcohol-associated pancreatitis.
BACKGROUND: Parkinson's disease (PD) is a neurodegenerative disorder with a prevalence increasing with age. Oxidative stress and glutamate toxicity are involved in its pathomechanism. There are still many unmet needs of PD patients, including the alleviation of motor fluctuations and dyskinesias, and the development of therapies with neuroprotective potential. OBJECTIVE: To give an overview of the pharmacological properties, the efficacy and safety of the monoamine oxidase B (MAO-B) inhibitors in the treatment of PD, with special focus on the results of randomized clinical trials. METHOD: A literature search was conducted in PubMed for 'PD treatment', 'MAO-B inhibitors', 'selegiline', 'rasagiline', 'safinamide' and 'clinical trials' with 'MAO-B inhibitors' in 'Parkinson's disease'. RESULTS: MAO-B inhibitors have a favorable pharmacokinetic profile, improve the dopamine deficient state and may have neuroprotective properties. Safinamide exhibits an anti-glutamatergic effect as well. When applied as monotherapy, MAO-B inhibitors provide a modest, but significant improvement of motor function and delay the need for levodopa. Rasagiline and safinamide were proven safe and effective when added to a dopamine agonist in early PD. As add-on to levodopa, MAO-B inhibitors significantly reduced off-time and were comparable in efficacy to COMT inhibitors. Improvements were achieved as regards certain non-motor symptoms as well. CONCLUSION: Due to the efficacy shown in clinical trials and their favorable side-effect profile, MAO-B inhibitors are valuable drugs in the treatment of PD. They are recommended as monotherapy in the early stages of the disease and as add-on therapy to levodopa in advanced PD.
One of the most critical issues in large-scale software development and maintenance is the rapidly growing size and complexity of software systems. As a result of this rapid growth there is a need to better understand the relationships between the different parts of a large software system. In this paper we present a reverse engineering framework called Columbus that is able to analyze large C++ projects, and a schema for C++ that prescribes the form of the extracted data. The flexible architecture of the Columbus system with a powerful C++ analyzer and schema makes it a versatile and readily extendible toolset for reverse engineering. This tool is free for scientific and educational purposes and we fervently hope that it will assist academic persons in any research work related to C++ re- and reverse engineering.
BACKGROUND: In this paper we focus on the problem of automatically constructing ICD-9-CM coding systems for radiology reports. ICD-9-CM codes are used for billing purposes by health institutes and are assigned to clinical records manually following clinical treatment. Since this labeling task requires expert knowledge in the field of medicine, the process itself is costly and is prone to errors as human annotators have to consider thousands of possible codes when assigning the right ICD-9-CM labels to a document. In this study we use the datasets made available for training and testing automated ICD-9-CM coding systems by the organisers of an International Challenge on Classifying Clinical Free Text Using Natural Language Processing in spring 2007. The challenge itself was dominated by entirely or partly rule-based systems that solve the coding task using a set of hand crafted expert rules. Since the feasibility of the construction of such systems for thousands of ICD codes is indeed questionable, we decided to examine the problem of automatically constructing similar rule sets that turned out to achieve a remarkable accuracy in the shared task challenge. RESULTS: Our results are very promising in the sense that we managed to achieve comparable results with purely hand-crafted ICD-9-CM classifiers. Our best model got a 90.26% F measure on the training dataset and an 88.93% F measure on the challenge test dataset, using the micro-averaged F beta=1 measure, the official evaluation metric of the International Challenge on Classifying Clinical Free Text Using Natural Language Processing. This result would have placed second in the challenge, with a hand-crafted system achieving slightly better results. CONCLUSIONS: Our results demonstrate that hand-crafted systems - which proved to be successful in ICD-9-CM coding - can be reproduced by replacing several laborious steps in their construction with machine learning models. 
These hybrid systems preserve the favourable aspects of rule-based classifiers like good performance, and their development can be achieved rapidly and requires less human effort. Hence the construction of such hybrid systems can be feasible for a set of labels one magnitude bigger, and with more labeled data.
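The micro-averaged F measure (beta = 1) used as the evaluation metric in this abstract pools true positives, false positives and false negatives over all documents and labels before computing precision and recall. A minimal Python sketch follows. The function name `micro_f1` and the small gold and predicted label sets are illustrative assumptions, not data from the challenge.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 for multi-label assignments.

    gold, predicted -- parallel lists with one set of codes per document
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)  # codes assigned correctly
        fp += len(p - g)  # codes assigned but not in the gold standard
        fn += len(g - p)  # gold codes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for three reports, used only to show the arithmetic.
gold = [{"786.2"}, {"599.0", "788.30"}, {"486"}]
pred = [{"786.2"}, {"599.0"}, {"486", "780.6"}]
score = micro_f1(gold, pred)
```

Here three codes are correct, one is spurious and one is missed, so precision, recall and F1 all equal 0.75. Because the counts are pooled, frequent codes weigh more heavily in this metric than rare ones.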