NobleBlocks

Samsung (Brazil)

Company · São Paulo, Brazil

Research output, citation impact, and the most-cited recent papers from Samsung (Brazil). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 220
Citations: 4.2K
h-index: 25
i10-index: 107
Also known as: Samsung (Brazil)

Top-cited papers from Samsung (Brazil)

Do deep features generalize from everyday objects to remote sensing and aerial scenes domains?
Otávio A. B. Penatti, Keiller Nogueira, Jefersson A. dos Santos
2015 · 714 citations · doi:10.1109/cvprw.2015.7301382

In this paper, we evaluate the generalization power of deep features (ConvNets) in two new scenarios: aerial and remote sensing image classification. We evaluate experimentally ConvNets trained for recognizing everyday objects for the classification of aerial and remote sensing images. ConvNets obtained the best results for aerial images, while for remote sensing, they performed well but were outperformed by low-level color descriptors, such as BIC. We also present a correlation analysis, showing the potential for combining/fusing different ConvNets with other descriptors or even for combining multiple ConvNets. A preliminary set of experiments fusing ConvNets obtains state-of-the-art results for the well-known UCMerced dataset.

Multi-task Deep Learning for Real-Time 3D Human Pose Estimation and Action Recognition
Diogo Luvizon, David Picard, Hedi Tabia
2020 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 148 citations · doi:10.1109/tpami.2020.2976014

Human pose estimation and action recognition are related tasks since both problems are strongly dependent on the human body representation and analysis. Nonetheless, most recent methods in the literature handle the two problems separately. In this article, we propose a multi-task framework for jointly estimating 2D or 3D human poses from monocular color images and classifying human actions from video sequences. We show that a single architecture can be used to solve both problems in an efficient way and still achieves state-of-the-art or comparable results at each task while running with a throughput of more than 100 frames per second. The proposed method benefits from high parameter sharing between the two tasks by unifying still images and video clips processing in a single pipeline, allowing the model to be trained with data from different categories simultaneously and in a seamless way. Additionally, we provide important insights for end-to-end training the proposed multi-task model by decoupling key prediction parts, which consistently leads to better accuracy on both tasks. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU RGB+D) demonstrate the effectiveness of our method on the targeted tasks. Our source code and trained weights are publicly available at https://github.com/dluvizon/deephar.

Exploiting ConvNet Diversity for Flooding Identification
Keiller Nogueira, Samuel G. Fadel, Ícaro Cavalcante Dourado, Rafael de Oliveira Werneck +4 more
2018 · IEEE Geoscience and Remote Sensing Letters · 91 citations · doi:10.1109/lgrs.2018.2845549

Flooding is the world's most costly type of natural disaster in terms of both economic losses and human casualties. A first and essential procedure toward flood monitoring is based on identifying the area most vulnerable to flooding, which gives authorities relevant regions to focus on. In this letter, we propose several methods to perform flooding identification in high-resolution remote sensing images using deep learning. Specifically, some proposed techniques are based upon unique networks, such as dilated and deconvolutional ones, whereas others were conceived to exploit diversity of distinct networks in order to extract the maximum performance of each classifier. The evaluation of the proposed methods was conducted in a high-resolution remote sensing data set. Results show that the proposed algorithms outperformed the state-of-the-art baselines, providing improvements ranging from 1% to 4% in terms of the Jaccard Index.
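The Jaccard Index used to report these gains is the ratio of intersection to union between a predicted mask and a reference mask. A minimal Python sketch, assuming flat binary masks; the function name and the sample masks are illustrative and do not come from the letter:

```python
def jaccard_index(pred, truth):
    """Jaccard index (intersection over union) of two binary masks.

    Both masks are flat sequences of 0/1 values of equal length.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Convention chosen for this sketch: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 8-pixel example: 1 marks a pixel labelled as flooded.
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1, 1, 0]
score = jaccard_index(predicted, reference)  # 3 shared pixels out of 5 in the union
print(score)
```

On real imagery the same ratio would be computed over every pixel of the segmentation map, typically per class.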

A 4D DCT-Based Lenslet Light Field Codec
Murilo B. de Carvalho, Marcio P. Pereira, Gustavo Alves, Eduardo A. B. da Silva +3 more
2018 · 56 citations · doi:10.1109/icip.2018.8451684

Light fields aim to represent visual information in 3D space. They are 4D structures that contain the images of a given scene from a sampled 2D range of viewpoints. When acquired using a lenslet camera, in addition to the ordinary intra-view redundancy, these views have a great deal of inter-view redundancy. In this work we propose a light field codec that fully exploits the 4D redundancy of light fields by using a 4D transform and hexadeca-trees. It initially divides the light field into 4D blocks and computes a 4D Discrete Cosine Transform of each one. Then the transform coefficients of the 4D block are grouped using hexadeca-trees on a bitplane-by-bitplane basis, and the generated stream is encoded using an adaptive arithmetic coder. The proposed codec has been employed to encode the JPEG Pleno lenslet light fields. The rate-distortion results have been assessed using test conditions comparable to the ones presented at the ICIP 2017 Light Field Coding Grand Challenge. The proposed codec, despite being conceptually simple, achieves competitive rate-distortion performance.
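The first stage described in this abstract, a 4D Discrete Cosine Transform of each 4D block, can be sketched as a separable transform: a 1D DCT applied along each of the four axes in turn. The Python below is only an illustration under stated assumptions. The dictionary-based block layout, the function names, and the tiny 2×2×2×2 block are hypothetical, and the hexadeca-tree grouping and arithmetic coding stages of the codec are not shown.

```python
import math
from itertools import product


def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def dct_4d(block, shape):
    """Separable 4D DCT: run the 1D DCT along each of the four axes in turn.

    `block` maps index tuples (a, b, c, d) to sample values; `shape` gives
    the block size along each axis.
    """
    coeffs = dict(block)
    for axis in range(4):
        other_ranges = [range(s) for i, s in enumerate(shape) if i != axis]
        for rest in product(*other_ranges):

            def index(p):
                # Re-insert position p at the axis currently being transformed.
                return rest[:axis] + (p,) + rest[axis:]

            line = [coeffs[index(p)] for p in range(shape[axis])]
            for p, value in enumerate(dct_1d(line)):
                coeffs[index(p)] = value
    return coeffs


# Hypothetical constant block: all energy should land in the DC coefficient.
shape = (2, 2, 2, 2)
block = {idx: 1.0 for idx in product(range(2), repeat=4)}
coeffs = dct_4d(block, shape)
print(coeffs[(0, 0, 0, 0)])
```

A constant block compacts into a single non-zero coefficient, which is the kind of redundancy a transform codec relies on; a practical implementation would use an optimized array library instead of nested Python loops.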

Applying user-centered techniques to analyze and design a mobile application
Adriana Lopes, Natasha Valentim, Bruna de Oliveira Moraes, Renata Zilse +1 more
2018 · Journal of Software Engineering Research and Development · 44 citations · doi:10.1186/s40411-018-0049-1

Techniques that help in understanding and designing user needs are increasingly being used in Software Engineering to improve the acceptance of applications. Among these techniques we can cite personas, scenarios and interaction models. Personas are fictitious representations of target users. Scenarios provide various types of information at different levels of abstraction. Interaction models help in the design of an adequate user interaction with the system. This paper presents research that reports a set of practical activities applied by a software team using techniques in the analysis and design phases of a mobile application. In the analysis phase, we created personas and scenarios for the extraction of requirements. In the design phase, we created interaction models to describe the behavior between user and system during the interaction. We employed these interaction models to develop other artifacts, such as prototypes. In addition, we presented a technique developed by the analysis and design team for the inspection of interaction models. This technique reduced the spread of defects in the interaction models. From the results of this research, we suggest: (i) employing personas and scenarios to understand the requirements; (ii) employing interaction models to understand the behavior between user and system; and (iii) using interaction models as a basis to develop other artifacts. Through the reporting of this set of practical activities, we hope to provide support for software engineers willing to adopt techniques that support the analysis and design of applications aiming at better quality of use for their users.

Data, Depth, and Design: Learning Reliable Models for Skin Lesion Analysis
Eduardo Valle, Michel Fornaciali, Afonso Menegola, Julia Tavares +3 more
2017 · arXiv (Cornell University) · 42 citations · doi:10.48550/arxiv.1711.00441

Deep learning fostered a leap ahead in automated skin lesion analysis in the last two years. Those models are expensive to train and difficult to parameterize. Objective: We investigate methodological issues for designing and evaluating deep learning models for skin lesion analysis. We explore 10 choices faced by researchers: use of transfer learning, model architecture, train dataset, image resolution, type of data augmentation, input normalization, use of segmentation, duration of training, additional use of SVMs, and test data augmentation. Methods: We perform two full factorial experiments, for five different test datasets, resulting in 2560 exhaustive trials in our main experiment, and 1280 trials in our assessment of transfer learning. We analyze both with multi-way ANOVA. We use the exhaustive trials to simulate sequential decisions and ensembles, with and without the use of privileged information from the test set. Results -- main experiment: Amount of train data has disproportionate influence, explaining almost half the variation in performance. Of the other factors, test data augmentation and input resolution are the most influential. Deeper models, when combined, with extra data, also help. -- transfer experiment: Transfer learning is critical, its absence brings huge performance penalties. -- simulations: Ensembles of models are the best option to provide reliable results with limited resources, without using privileged information and sacrificing methodological rigor. Conclusions and Significance: Advancing research on automated skin lesion analysis requires curating larger public datasets. Indirect use of privileged information from the test set to design the models is a subtle, but frequent methodological mistake that leads to overoptimistic results. Ensembles of models are a cost-effective alternative to the expensive full-factorial and to the unstable sequential designs.

Pixel-Level Tissue Classification for Ultrasound Images
Daniel V. Pazinato, Bernardo V. Stein, Waldir R. De Almeida, Rafael de Oliveira Werneck +4 more
2014 · IEEE Journal of Biomedical and Health Informatics · 38 citations · doi:10.1109/jbhi.2014.2386796

BACKGROUND: Pixel-level tissue classification for ultrasound images, commonly applied to carotid images, is usually based on defining thresholds for the isolated pixel values. Ranges of pixel values are defined for the classification of each tissue. The classification of pixels is then used to determine the carotid plaque composition and, consequently, to determine the risk of diseases (e.g., strokes) and whether or not a surgery is necessary. The use of threshold-based methods dates from the early 2000s but it is still widely used for virtual histology. METHODOLOGY/PRINCIPAL FINDINGS: We propose the use of descriptors that take into account information about a neighborhood of a pixel when classifying it. We evaluated experimentally different descriptors (statistical moments, texture-based, gradient-based, local binary patterns, etc.) on a dataset of five types of tissues: blood, lipids, muscle, fibrous, and calcium. The pipeline of the proposed classification method is based on image normalization, multiscale feature extraction, including the proposal of a new descriptor, and machine learning classification. We have also analyzed the correlation between the proposed pixel classification method in the ultrasound images and the real histology with the aid of medical specialists. CONCLUSIONS/SIGNIFICANCE: The classification accuracy obtained by the proposed method with the novel descriptor in the ultrasound tissue images (around 73%) is significantly above the accuracy of the state-of-the-art threshold-based methods (around 54%). The results are validated by statistical tests. The correlation between the virtual and real histology confirms the quality of the proposed approach showing it is a robust ally for the virtual histology in ultrasound images.

Detection of Fragmented Rectangular Enclosures in Very High Resolution Remote Sensing Images
Igor Zingman, Dietmar Saupe, Otávio A. B. Penatti, Karsten Lambers
2016 · IEEE Transactions on Geoscience and Remote Sensing · 35 citations · doi:10.1109/tgrs.2016.2545919

We develop an approach for the detection of ruins of livestock enclosures (LEs) in alpine areas captured by high-resolution remotely sensed images. These structures are usually of approximately rectangular shape and appear in images as faint fragmented contours in complex background. We address this problem by introducing a rectangularity feature that quantifies the degree of alignment of an optimal subset of extracted linear segments with a contour of rectangular shape. The rectangularity feature has high values not only for perfectly regular enclosures but also for ruined ones with distorted angles, fragmented walls, or even a completely missing wall. Furthermore, it has a zero value for spurious structures with less than three sides of a perceivable rectangle. We show how the detection performance can be improved by learning a linear combination of the rectangularity and size features from just a few available representative examples and a large number of negatives. Our approach allowed detection of enclosures in the Silvretta Alps that were previously unknown. A comparative performance analysis is provided. Among other features, our comparison includes the state-of-the-art features that were generated by pretrained deep convolutional neural networks (CNNs). The deep CNN features, although learned from a very different type of images, provided the basic ability to capture the visual concept of the LEs. However, our handcrafted rectangularity-size features showed considerably higher performance.

A survey on data analysis on large-Scale wireless networks: online stream processing, trends, and challenges
Dianne S. V. Medeiros, Helio N. Cunha Neto, Martin Andreoni Lopez, Luiz Magalhães +4 more
2020 · Journal of Internet Services and Applications · 31 citations · doi:10.1186/s13174-020-00127-2

In this paper we focus on knowledge extraction from large-scale wireless networks through stream processing. We present the primary methods for sampling, data collection, and monitoring of wireless networks and we characterize knowledge extraction as a machine learning problem on big data stream processing. We show the main trends in big data stream processing frameworks. Additionally, we explore the data preprocessing, feature engineering, and the machine learning algorithms applied to the scenario of wireless network analytics. We address challenges and present research projects in wireless network monitoring and stream processing. Finally, future perspectives, such as deep learning and reinforcement learning in stream processing, are anticipated.

The JPEG Pleno Light Field Coding Standard 4D-Transform Mode: How to Design an Efficient 4D-Native Codec
Gustavo Alves, Murilo B. de Carvalho, Carla L. Pagliari, Pedro Garcia Freitas +4 more
2020 · IEEE Access · 28 citations · doi:10.1109/access.2020.3024844

The increasing demand for highly realistic and immersive visual experiences has led to the emergence of richer 3D visual representation models such as light fields, point clouds and meshes. Light fields may be modelled as a 2D array of 2D views, corresponding to a large amount of data, which demands highly efficient coding solutions. Although static light fields are inherently 4D structures, the coding solutions in the literature mostly employ traditional 2D coding tools associated with techniques such as depth-based image rendering to generate a residual, again to be coded using available 2D coding tools. To address the market needs, JPEG has launched the so-called JPEG Pleno standard, whose Part 2 is dedicated to light field coding. The JPEG Pleno Light Field Coding standard includes two coding modes, one based on the 4D-DCT, so-called 4D-Transform mode, and another based on depth-based synthesis, so-called 4D-Prediction. The 4D-Transform coding mode standardizes for the first time a 4D-native light field coding solution where the full light field redundancy across the four dimensions is comprehensively exploited, somehow extending to 4D the 2D coding framework adopted decades ago by the popular JPEG Baseline standard. In this context, this paper describes and analyzes in detail the conceptual and algorithmic design process which has led to the creation of the JPEG Pleno Light Field Coding standard 4D-Transform coding mode. This has happened through a sequence of steps involving technical innovation design and integration where increasingly sophisticated coding tools have been combined and improved to maximize the final rate distortion (RD) performance.

HADS: Hybrid Anomaly Detection System for IoT Environments
Parth Bhatt, Anderson Morais
2018 · 25 citations · doi:10.1109/iintec.2018.8695303

IoT (Internet of Things) devices are rapidly becoming popular in residential environments, but security is still a big concern in this ecosystem. The fast growth of IoT devices in homes and new attacks targeting these devices require a smart detection solution to protect this heterogeneous environment. In this paper, we present an attack detection approach based on machine learning techniques for anomaly detection, and a decision module, with the goal of identifying relevant attacks on IoT network. The approach is implemented on a single-board computer and systematically evaluated using various protocol attacks and commercial off-the-shelf IoT devices to verify its effectiveness and feasibility in a realistic scenario. The results obtained in the experimental evaluation indicate that our proposed approach can be applied to protect IoT devices against the considered attacks with accuracy of 94%-99% and detection time less than 0.7s.

Temporal Robust Features for Violence Detection
Daniel Moreira, Sandra Avila, Mauricio Pérez, Daniel Moraes +4 more
2017 · 24 citations · doi:10.1109/wacv.2017.50

Automatically detecting violence in videos is paramount for enforcing the law and providing the society with better policies for safer public places. In addition, it may be essential for protecting minors from accessing inappropriate contents on-line, and for helping parents choose suitable movie titles for their children. However, this is an open problem as the very definition of violence is subjective and may vary from one society to another. Detecting such nuances from video footages with no human supervision is very challenging. Clearly, when designing a computer-aided solution to this problem, we need to think of efficient (quickly harness large troves of data) and effective detection methods (robustly filter what needs special attention and further analysis). In this vein, we explore a content description method for violence detection founded upon temporal robust features that quickly grasp video sequences, automatically classifying violent videos. The used method also holds promise for fast and effective classification of other recognition tasks (e.g., pornography and other inappropriate material). When compared to more complex counterparts for violence detection, the method shows similar classification quality while being several times more efficient in terms of runtime and memory footprint.

A Lightweight and Multi-Stage Approach for Android Malware Detection Using Non-Invasive Machine Learning Techniques
Leonardo da Costa, Vitor Hugo Galhardo Moia
2023 · IEEE Access · 22 citations · doi:10.1109/access.2023.3296606

Android has been a constant target of cybercriminals that try to attack one of the most used operating systems, commonly using malicious applications (denominated malware) that, once installed on a device, can harm users in several ways. In this context, we propose an approach to detect Android malware consisting of a set of specific-type detectors in which each one performs a multi-stage analysis, based on rules and machine learning techniques, in different phases of the application cycle (before and after its installation). Our approach differs from state-of-the-art solutions by being non-invasive, since it leverages a process to obtain application’s features that does not infringe licenses and terms of use of applications. In addition, according to experiments performed on a real Android smartphone, our proposal presents the following additional advantages over state-of-the-art solutions: a more efficient process to classify applications that is three times faster and requires ten times less CPU usage in some cases (saving device energy); and a better detection performance, with higher balanced accuracy, nine times less false positive cases, and ten times less false negative cases.

Observation based analysis on the use of mobile applications for visually impaired users
Clauirton Siebra, Tatiana B. Gouveia, Jefté Macêdo, Walter Franklin Marques Correia +4 more
2016 · 17 citations · doi:10.1145/2957265.2961848

The current efforts to specify a usability guideline for accessible mobile applications are sparse and they are still far from presenting a concrete pattern. Our previous work carried out a broad survey to consolidate the findings of these efforts in a unique list with 36 requirements, 13 of them focused on vision impairments. In this paper we show the results of an observation-based analysis involving visually impaired volunteers, whose aim was to complement this review and confirm if the lack of these requirements in fact affects the use of mobile applications.

Efficient and Effective Hierarchical Feature Propagation
Jefersson A. dos Santos, Otávio A. B. Penatti, Philippe-Henri Gosselin, Alexandre X. Falcão +2 more
2014 · IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing · 16 citations · doi:10.1109/jstars.2014.2341175

Many methods have been recently proposed to deal with the large amount of data provided by the new remote sensing technologies. Several of those methods rely on the use of segmented regions. However, a common issue in region-based applications is the definition of the appropriate representation scale of the data, a problem usually addressed by exploiting multiple scales of segmentation. The use of multiple scales, however, raises new challenges related to the definition of effective and efficient mechanisms for extracting features. In this paper, we address the problem of extracting features from a hierarchy by proposing two approaches that exploit the existing relationships among regions at different scales. The H-Propagation propagates any histogram-based low-level descriptors. The bag-of-visual-word (BoW)-Propagation approach uses the BoWs model to propagate features along multiple scales. The proposed methods are very efficient, as features need to be extracted only at the base of the hierarchy and yield comparable results to low-level extraction approaches.

Toward characterizing cardiovascular fitness using machine learning based on unobtrusive data
Maria Cecília Moraes Frade, Thomas Beltrame, Mariana de Oliveira Góis, Allan Pinto +3 more
2023 · PLoS ONE · 16 citations · doi:10.1371/journal.pone.0282398

Cardiopulmonary exercise testing (CPET) is a non-invasive approach to measure the maximum oxygen uptake (VO2max), which is an index to assess cardiovascular fitness (CF). However, CPET is not available to all populations and cannot be obtained continuously. Thus, wearable sensors are associated with machine learning (ML) algorithms to investigate CF. Therefore, this study aimed to predict CF by using ML algorithms using data obtained by wearable technologies. For this purpose, 43 volunteers with different levels of aerobic power, who wore a wearable device to collect unobtrusive data for 7 days, were evaluated by CPET. Eleven inputs (sex, age, weight, height, and body mass index, breathing rate, minute ventilation, total hip acceleration, walking cadence, heart rate, and tidal volume) were used to predict the VO2max by support vector regression (SVR). Afterward, the SHapley Additive exPlanations (SHAP) method was used to explain their results. SVR was able to predict the CF, and the SHAP method showed that the inputs related to hemodynamic and anthropometric domains were the most important ones to predict the CF. Therefore, we conclude that the cardiovascular fitness can be predicted by wearable technologies associated with machine learning during unsupervised activities of daily living.

A Novel Scheduling Strategy for MMT-Based Multipath Video Streaming
Samira Afzal, Vanessa Testoni, Jean Felipe Fonseca de Oliveira, Christian Esteve Rothenberg +2 more
2018 · 15 citations · doi:10.1109/glocom.2018.8648134

Bandwidth constraints and high end-to-end delays are real challenges for achieving and sustaining high quality mobile video streaming services. Diverse multipath transmission techniques are being investigated as possible solutions, since recent developments have enabled mobile devices users to receive video data simultaneously over multiple interfaces (e.g., LTE and WiFi). While some multipath protocols have been recently standardized for this purpose (e.g., MPTCP), being network layer protocols they cannot properly handle challenging transmission scenarios subject to packet losses and congestion, such as lossy wireless channels. In this work, we adopt the MPEG Media Transport (MMT) protocol to propose an improvement for mobile multipath video streaming solutions. MMT is an application layer protocol with inherent hybrid media delivery properties. We propose a novel path-and-content-aware scheduling strategy for MMT by means of full cooperation between network metrics and video content features. Our strategy provides better models to adaptively cope with unstable communication channel conditions and to improve the final user quality of experience (QoE). For the experimental evaluation, we used NS3-DCE to simulate a realistic multipath network scenario which includes channel error models and background traffic. Results for two video sequences are presented in terms of PSNR, SSIM, goodput, delay and packet loss rates. When compared with a simple scheduling strategy for the traditional multipath MMT, our approach yields significant packet loss rate reductions (~90%) and video quality improvements of around 12 dB for PSNR and 0.15 for SSIM.
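For readers unfamiliar with the PSNR figures quoted here, the metric is 10·log10(peak² / MSE) between a reference and a received signal. The Python below is a minimal sketch assuming 8-bit samples (peak value 255) stored in flat lists; the function name and sample values are hypothetical and unrelated to the paper's test sequences.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(distorted):
        raise ValueError("signals must have the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion at all
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 4-sample example (squared errors 1, 1, 4, 4 give MSE = 2.5).
original = [100, 120, 140, 160]
received = [101, 119, 142, 158]
quality_db = psnr(original, received)
print(round(quality_db, 2))

# Because the scale is logarithmic, a gain of g dB corresponds to the MSE
# shrinking by a factor of 10 ** (g / 10).
mse_reduction_for_12_db = 10 ** (12 / 10)
print(round(mse_reduction_for_12_db, 2))
```

Read this way, an improvement of around 12 dB corresponds to the mean squared error dropping by a factor of roughly 16.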

Usability requirements for mobile accessibility
Clauirton Siebra, Tatiana B. Gouveia, Jefté Macêdo, Walter Franklin Marques Correia +4 more
2015 · 14 citations · doi:10.1145/2836041.2841213

When multimedia applications intend to support accessibility, aspects of usability must be reviewed to adapt or extend common functional requirements that are implemented to ensure an easy use of applications. Furthermore, these requirements must be identified and analyzed in a contextualized way, since different types of impairments require different types of requirements. This work analyzed 247 scientific and technological articles to identify requirements that are being considered for different types of impairments. The collected information was consolidated and classified according to groups of impairments. As a result, this paper brings a checklist proposal focused on requirements for vision impaired users that should be considered by mobile multimedia applications to ensure accessibility with usability.

Age Estimation From Facial Parts Using Compact Multi-Stream Convolutional Neural Networks
Marcus de Assis Angeloni, Rodrigo de Freitas Pereira, Hélio Pedrini
2019 · 14 citations · doi:10.1109/iccvw.2019.00366

Age is a very useful property in the characterization of individuals, since it is an inherent biological attribute and plays a key role in many real-world applications such as preventing purchase of alcohol and tobacco by minors, human-computer interaction, soft biometrics, electronic customer relationship and as age synthesis in Forensic Art to find lost people. The aging process is influenced by external (health, lifestyle, smoking) and internal (genetics, gender) factors, which makes its estimation difficult for humans, and even more difficult for machines. In this work, we present and evaluate an age estimation approach in unconstrained images using facial parts (eyebrows, eyes, nose and mouth), cropped from the input images using landmarks, to feed a compact multi-stream convolutional neural network (CNN) architecture. Experimental results obtained in the challenging Adience benchmark with real-world images labeled with their respective age groups show that our method is competitive with the literature, even with a significantly smaller CNN and lower computational cost.

A Wearable Face Recognition System Built into a Smartwatch and the Visually Impaired User
Laurindo de Sousa Britto Neto, Vanessa Regina Margareth Lima Maike, Fernando Koch, M. Cecília C. Baranauskas +2 more
2015 · 14 citations · doi:10.5220/0005370200050012

Practitioners usually expect that real-time computer vision systems such as face recognition systems will require hardware components with high processing power. In this paper, we present a concept to show that it is technically possible to develop a simple real-time face recognition system in a wearable device with low processing power, in this case an assistive device for the visually impaired. Our platform of choice here is the first generation Samsung Galaxy Gear smartwatch. Running solely in the watch, without pairing to a phone or tablet, the system detects a face in the image captured by the camera, and then performs face recognition (on a limited dictionary), emitting an audio feedback that either identifies the recognized person or indicates that s/he is unknown. For the face recognition approach we use a variation of the K-NN algorithm which accomplished the task with high accuracy rates. This paper presents the proposed system and preliminary results on its evaluation.
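The abstract does not say which variation of K-NN the system uses. As a generic illustration only, the Python sketch below shows plain k-nearest-neighbour voting with a distance threshold that returns "unknown" for faces far from every enrolled example. The threshold rule, feature vectors, names, and parameter values are all assumptions made for this example and are not the authors' method.

```python
import math
from collections import Counter


def knn_identify(query, gallery, k=3, max_distance=1.0):
    """Return the majority label among the k nearest gallery entries.

    `gallery` is a list of (feature_vector, name) pairs. If even the closest
    entry is farther than `max_distance`, the face is reported as "unknown".
    """
    neighbours = sorted((math.dist(query, vec), name) for vec, name in gallery)[:k]
    if not neighbours or neighbours[0][0] > max_distance:
        return "unknown"
    votes = Counter(name for _, name in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors for two enrolled people.
gallery = [
    ([0.0, 0.0], "ana"),
    ([0.1, 0.0], "ana"),
    ([1.0, 1.0], "bruno"),
    ([1.1, 0.9], "bruno"),
]
print(knn_identify([0.05, 0.02], gallery))  # close to the "ana" samples
print(knn_identify([5.0, 5.0], gallery))    # far from every enrolled sample
```

A small gallery keeps the search cheap, which fits the limited dictionary mentioned in the abstract; real face features would have far more than two dimensions.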