Microsoft (Portugal)
Company, Lisbon, Portugal
Research output, citation impact, and the most-cited recent papers from Microsoft (Portugal). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Microsoft (Portugal)
In the increasingly complex service environment, value is cocreated through webs of interactions between provider networks and customer networks. This is evident in healthcare services, where well-being can be achieved only through the joint efforts of professional healthcare networks and patient networks. Addressing the challenge of creating network-level services, the service design for value networks (SD4VN) method designs services as enablers of many-to-many value cocreating interactions among network actors. By integrating previous research on value networks and service design, SD4VN develops a process and a set of models beyond supporting dyadic (customer–service provider) interactions to understanding the interrelated activities, interactions, and goals of network actors and designing services to support the different actors in reaching their goals with balanced centricity. Following a design science research approach, this paper presents the SD4VN method and reports on a case application of the method used to design the Portuguese national electronic health record service Plataforma de Dados da Saúde (PDS). The case application involved focus groups, in-depth interviews, and participatory design sessions with over 170 participants at different service design stages, showing the importance of designing a balanced, integrated service. The case application also shows how SD4VN can support a wider adoption of the service and improve the health service system.
We describe an experiment that explores the contribution of auditory and other features to the illusion of plausibility in a virtual environment that depicts the performance of a string quartet. 'Plausibility' refers to the component of presence that is the illusion that the perceived events in the virtual environment are really happening. The features studied were: Gaze (the musicians ignored the participant, the musicians sometimes looked towards and followed the participant's movements), Sound Spatialization (Mono, Stereo, Spatial), Auralization (no sound reflections, reflections corresponding to a room larger than the one perceived, reflections that exactly matched the virtual room), and Environment (no sound from outside of the room, birdsong and wind corresponding to the outside scene). We adopted the methodology based on color matching theory, where 20 participants were first able to assess their feeling of plausibility in the environment with each of the four features at their highest setting. Then five times participants started from a low setting on all features and were able to make transitions from one system configuration to another until they matched their original feeling of plausibility. From these transitions a Markov transition matrix was constructed, and also probabilities of a match conditional on feature configuration. The results show that Environment and Gaze were individually the most important factors influencing the level of plausibility. The highest probability transitions were to improve Environment and Gaze, and then Auralization and Spatialization. We present this work as both a contribution to the methodology of assessing presence without questionnaires, and showing how various aspects of a musical performance can influence plausibility.
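The transition-matrix step described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the encoding of each configuration as a tuple of (gaze, spatialization, auralization, environment) levels, the function name `transition_matrix`, and the three example trials are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def transition_matrix(sequences):
    """Estimate row-normalised transition probabilities from state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every observed step from one configuration to the next.
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    matrix = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        matrix[src] = {dst: n / total for dst, n in dsts.items()}
    return matrix


# Hypothetical trials: each state is (gaze, spatialization, auralization, environment).
trials = [
    [(0, 0, 0, 0), (0, 0, 0, 1), (1, 0, 0, 1)],
    [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1)],
    [(0, 0, 0, 0), (1, 0, 0, 0), (1, 0, 0, 1)],
]
P = transition_matrix(trials)
```

Each row of `P` sums to one, so the largest entry in a row is the most probable next improvement from that configuration.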
INTRODUCTION: Innovations in 3D spatial technology and augmented reality imaging driven by digital high-tech industrial science have accelerated experimental advances in breast cancer imaging and the development of medical procedures aimed to reduce invasiveness. PRESENTATION OF CASE: A 57-year-old post-menopausal woman presented with screen-detected left-sided breast cancer. After undergoing all staging and pre-operative studies the patient was proposed for conservative breast surgery with tumor localization. During surgery, an experimental digital and non-invasive intra-operative localization method with augmented reality was compared with the standard pre-operative localization with carbon tattooing (institutional protocol). The breast surgeon wearing an augmented reality headset (Hololens) was able to visualize the tumor location projection inside the patient's left breast in the usual supine position. DISCUSSION: This work describes, to our knowledge, the first experimental test with a digital non-invasive method for intra-operative breast cancer localization using augmented reality to guide breast conservative surgery. In this case, a successful overlap of the previous standard pre-operative marks with carbon tattooing and tumor visualization inside the patient's breast with augmented reality was obtained. CONCLUSION: Breast cancer conservative guided surgery with augmented reality can pave the way for a digital non-invasive method for intra-operative tumor localization.
The diversity of data collected on both social networks and digital interfaces has increased sharply, raising the problem of heterogeneous variables that are rarely favourable to classification algorithms. Despite significant improvements in machine learning (ML) and predictive analysis for classification in customer relationship management (CRM) systems, their performance remains limited by heterogeneous data processing, class imbalance, and feature scales. This impact turned out to be greater for simple ML methods, which in addition often suffer from over-fitting. This paper proposes a succinct and detailed ML model building process including cross-validation of the combination of SMOTE to balance data and ensemble methods for modelling. From the conducted experiments, the random forest (RF) model yielded the best performance of 0.86 in terms of accuracy and F1-score using balanced data. It confirms the literature summary about this topic, which shows that RF was among the most effective algorithms for customer predictive classification issues. The constructed and optimized models were interpreted by Shapley values and feature importance analysis, which show that the “age” feature was the most significant while “HasCrCard” was the least. This process has proven effective in bridging previously reported research gaps and the resulting model should be used for supporting bank customer loyalty decision-making.
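The balancing idea behind SMOTE can be sketched in a few lines: a synthetic minority sample is placed on the segment between a minority point and one of its nearest minority neighbours. The sketch below is a simplified, standard-library illustration under that assumption. It is not the paper's pipeline and not the imbalanced-learn implementation, and the name `smote_like` and the toy points are hypothetical.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified illustration only; requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, mate)))
    return synthetic


# Hypothetical minority-class samples with two numeric features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority, n_new=5)
```

When combined with cross-validation, oversampling of this kind is normally applied to the training folds only, so that synthetic points never leak into the held-out fold.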
Environmental issues have gradually gained attention in the last decade because of increased global warming and high waste production. Therefore, this article aims to add value to environmental management research by analyzing green product innovation through market orientation. Moreover, this study includes green self-efficacy as a mediator, which has received less attention in past literature, to examine employees’ confidence in innovating green products according to customers’ needs. In addition, resource bricolage is introduced as a moderator because few studies report empirical results on organizations that produce, or tend to produce, innovative green products with limited resources. Data were collected from 477 employees of small and medium-sized enterprises in Pakistan using a self-administered questionnaire. Empirical results obtained with SmartPLS software show that market orientation has a positive and significant impact on green self-efficacy and green product innovation. Moreover, green self-efficacy shows a significant mediation impact between market orientation and green product innovation. Additionally, resource bricolage also moderates the relationship between market orientation and green product innovation. Overall, the study contributes to theoretical and practical knowledge about green product innovation in tackling the world’s environmental issues.
The PaeLife project is a European industry-academia collaboration whose goal is to provide the elderly with easy access to online services that make their life easier and encourage their continued participation in the society. To reach this goal, the project partners are developing a multimodal virtual personal life assistant (PLA) offering a wide range of services from weather information to social networking. This paper presents the multimodal architecture of the PLA, the services provided by the PLA, and the work done in the area of speech input and output modalities, which play a key role in the application.
This paper presents a study of European Portuguese elderly speech, in which the acoustic characteristics of two groups of elderly speakers (aged 60-75 and over 75) are compared with those of young adult speakers (aged 19-30). The correlation between age and a set of 14 acoustic features was investigated, and decision trees were used to establish the relative importance of the features. A greater use of pauses characterized speakers aged 60 and over. For female speakers, speech rate also appeared to correlate with age. For male speakers, jitter distinguished between speakers aged 60-75 and older. The correlation between the features and speech recognition performance was also investigated. Word error rate correlated mostly with the use of pauses, speech rate, and the ratio of long phone realizations. Finally, by comparing the phone sequences used by the recognizer on the most frequent words, we observed that the young adult speakers reduced schwas more than the elderly speakers. This result seems to confirm the common idea that young speakers reduce articulation more than older speakers. Further investigation is needed to confirm this result by determining whether this is due to ageing or to the generation gap.
The use of machine learning (ML) methods has been widely discussed for over a decade. The search for the optimal model is still a challenge that researchers seek to address. Despite advances in current work that surpass the limitations of previous ones, research still faces new challenges in every field. For the automatic targeting of customers in a banking telemarketing campaign, the use of ML-based approaches in previous work has not been able to show transparency in the processing of heterogeneous data, achieve optimal performance or use minimal resources. In this paper, we introduce a class membership-based (CMB) classifier which is a transparent approach well adapted to heterogeneous data that exploits nominal variables in the decision function. These dummy variables are often either suppressed or coded in an arbitrary way in most works without really evaluating their impact on the final performance of the models. In many cases, their coding either favours or disfavours the learning model performance without necessarily reflecting reality, which leads to over-fitting or decreased performance. In this work, we applied the CMB approach to data from a bank telemarketing campaign to build an optimal model for predicting potential customers before launching a campaign. The results obtained suggest that the CMB approach can predict the success of future prospecting more accurately than previous work. Furthermore, in addition to its better performance in terms of accuracy (97.3%), the model also gives a very close score for the AUC (95.9%), showing its stability, which would be very unfavourable to over-fitting.
A Silent Speech Interface (SSI) aims at performing Automatic Speech Recognition (ASR) in the absence of an intelligible acoustic signal. It can be used as a human-computer interaction modality in high-background-noise environments, such as living rooms, or in aiding speech-impaired individuals, whose numbers increase with ageing. If this interaction modality is made available in users' own native language, with adequate performance, and since it does not rely on acoustic information, it will be less susceptible to problems related to environmental noise, privacy, information disclosure and exclusion of speech-impaired persons. To contribute to the existence of this promising modality for Portuguese, for which no SSI implementation is known, we are exploring and evaluating the potential of state-of-the-art approaches. One of the major challenges we face in SSI for European Portuguese is recognition of nasality, a core characteristic of this language's phonetics and phonology. In this paper a silent speech recognition experiment based on Surface Electromyography is presented. Results confirmed recognition problems between minimal pairs of words that differ only in the nasality of one of the phones, causing 50% of the total error and evidencing accuracy performance degradation, which correlates well with the existing knowledge.
This paper presents a multimodal prototype application that aims to promote the social integration of the elderly. The application enables communication with their social network through conferencing and social media services, using natural interaction modalities, like speech, touch and gestures. We begin by discussing the requirements and design guidelines that were taken into account for the development of the prototype. We also present the key elements of the development stage and the results of a usability study conducted with ten elderly volunteers. The usability study reveals that such a multimodal solution can simplify accessibility to the considered services. Results indicate that this system is simpler, more natural and more enjoyable than the current user interfaces. Furthermore, the natural interaction modalities of the proposed prototype allow elderly users to be more efficient and have a better user experience, thus offering an easier and faster way for this population to join the information era.
The PaeLife project is a European industry-academia collaboration in the framework of the Ambient Assisted Living Joint Programme (AAL JP), with a goal of developing a multimodal, multilingual virtual personal life assistant to help senior citizens remain active and socially integrated. Speech is one of the key interaction modalities of AALFred, the Windows application developed in the project; the application can be controlled using speech input in four European languages: French, Hungarian, Polish and Portuguese. This paper briefly presents the personal life assistant and then focuses on the speech-related achievements of the project. These include the collection, transcription and annotation of large corpora of elderly speech, the development of automatic speech recognisers optimised for elderly speakers, a speech modality component that can easily be reused in other applications, and an automatic grammar translation service that allows for fast expansion of the automatic speech recognition functionality to new languages.
In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion.
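One of the supervised filters named in this abstract, Fisher's ratio, can be sketched as follows. This assumes the common two-class form, (difference of class means)² divided by the sum of class variances. The paper's exact formulation and its multi-class handling are not given in the abstract, so the function names and toy data here are illustrative assumptions only.

```python
from statistics import mean, pvariance


def fisher_ratio(values_a, values_b):
    """Two-class Fisher's ratio for one feature (assumed form, see lead-in)."""
    num = (mean(values_a) - mean(values_b)) ** 2
    den = pvariance(values_a) + pvariance(values_b)
    if den == 0:
        return float("inf") if num > 0 else 0.0
    return num / den


def rank_features(X, y, k):
    """Return the indices of the k features with the highest Fisher's ratio."""
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == labels[0]]
        b = [row[j] for row, lab in zip(X, y) if lab == labels[1]]
        scores.append((fisher_ratio(a, b), j))
    scores.sort(key=lambda s: -s[0])  # highest score first
    return [j for _, j in scores[:k]]


# Hypothetical data: feature 0 separates the classes, feature 1 barely does.
X = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0]]
y = [0, 0, 1, 1]
selected = rank_features(X, y, k=1)
```

Being a filter, this scoring is independent of the classifier, which is why the same selected subset can be passed to any of the three classifiers evaluated.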
To achieve a full‐scale simulation of a pyrite mine, a highly immersive environment becomes necessary and this research has led to a complex system enabling users to walk through a virtual mine in real time, presenting all the behaviours present in such an environment. Some of the problems encountered are the tunnels' behaviours, including highly contrasted images due to the presence of the head light, narrow paths, elevators, sound reverberation and tunnel texture shades. The use of immersive virtual reality enables the generation of high‐quality simulations, because it is possible to control several feedback mechanisms such as the degree of luminance of produced imagery and spatial sound. In this research, a projection infrastructure and tracking system were specified and developed, aiming at producing the best results for this kind of simulation. To achieve our purposes, distributed algorithms were developed to run in a cluster solution that drives a four‐sided CAVE‐like environment. The complete production pipeline is presented, ranging from the developed authoring techniques, enabling fast production of new content for the simulation, to the tracking techniques produced for the improvement of the interaction.
The See Change survey was designed to make z > 1 cosmological measurements by efficiently discovering high-redshift Type Ia supernovae (SNe Ia) and improving cluster mass measurements through weak lensing. This survey observed twelve galaxy clusters with the Hubble Space Telescope (HST) spanning the redshift range z = 1.13–1.75, discovering 57 likely transients and 27 likely SNe Ia at z ∼ 0.8–2.3. As in similar previous surveys, this proved to be a highly efficient use of HST for supernova observations; the See Change survey additionally tested the feasibility of maintaining, or further increasing, the efficiency at yet higher redshifts, where we have less detailed information on the expected cluster masses and star formation rates. We find that the resulting number of SNe Ia per orbit is a factor of ∼8 higher than for a field search, and 45% of our orbits contained an active SN Ia within 22 rest-frame days of peak, with one of the clusters by itself yielding 6 of the SNe Ia. We present the survey design, pipeline, and supernova discoveries. Novel features include fully blinded supernova searches, the first random forest candidate classifier for undersampled IR data (with a 50% detection threshold within 0.05 mag of human searchers), real-time forward-modeling photometry of candidates, and semi-automated photometric classifications and follow-up forecasts. We also describe the spectroscopic follow-up, instrumental in measuring host galaxy redshifts. The cosmology analysis of our sample will be presented in a companion paper.
In this paper we present a free, open-source platform that translates (written) European Portuguese into Portuguese Sign Language in real time, with the signs produced by an avatar. We discuss the basic needs of such a system in terms of Natural Language Processing and Animation Synthesis, and propose an architecture for it. Moreover, we have selected a set of existing tools that fit our free, open-source philosophy, and implemented a prototype with them. Several case studies were conducted. A preliminary evaluation was done and, although the translation possibilities are still scarce and some adjustments still need to be made, our platform has already been warmly welcomed by the deaf community.
We present a pilot study aiming to explore the use of biometric sensing technology within a semi-immersive VR environment, where users face architectural spaces which induce sensations close to fear of heights, claustrophobia, frustration and relief. Electrodermal activity was used to detect users' emotional arousal while navigating in VR. Navigation conditions and participants' expertise with games were controlled. Main results show that physiological measurement of users' perceptions can discriminate well "positive" from "negative" spaces, providing designers with basic information on people's emotional state when using the buildings they design.
Technologies for Human-Computer Interaction (HCI) and Communication have evolved tremendously over the past decades. However, citizens such as mobility-impaired or elderly people, among others, still face many difficulties interacting with communication services, either due to HCI issues or intrinsic design problems with the services. In this paper we start by presenting the results of two user studies, the first one conducted with a group of mobility-impaired users, comprising paraplegic and quadriplegic individuals, and the second one with elderly users. The study participants carried out a set of tasks with a multimodal (speech, touch, gesture, keyboard and mouse) and multi-platform (mobile, desktop) system, offering integrated access to communication and entertainment services, such as email, agenda, conferencing, instant messaging and social media, referred to as LHC - Living Home Center. The system was designed to take into account the requirements captured from these users, with the objective of evaluating whether the adoption of multimodal interfaces for audio-visual communication and social media services could improve the interaction with such services. Our study revealed that a multimodal prototype system, offering natural interaction modalities, especially supporting speech and touch, can in fact improve access to the presented services, contributing to the reduction of social isolation of mobility-impaired as well as elderly users, and improving their digital inclusion.
Living Usability Lab for Next Generation Networks (www.livinglab.pt) is a Portuguese industry-academia collaborative R&D project, active in the field of live usability testing, focusing on the development of technologies and services to support healthy, productive and active citizens. The project adopts the principles of universal design and natural user interfaces (speech, gesture), making use of the benefits of next generation networks and distributed computing. Therefore, it will impact the general population, including the elderly and citizens with permanent or situational special needs. This paper presents project motivations, conceptual model, architecture and work in progress.
Human industrial activities are bringing physiochemical changes to the land, air, and seas and leading towards more uncertain climate changes, like drought, thunderstorms, and heat waves. This has resulted in water scarcity because of overexploitation of water resources. It is therefore imperative to develop effective conservation programs that consider the factors that affect the decisions of people with regard to water conservation and sustainable activities. This study considered the perspective of a developing country and explored the impact of three psychosocial factors, i.e., subjective happiness, perceived stress, and personal well-being, on individuals’ current and future intentions to conserve water. A sample of 304 respondents was collected via a self-administered questionnaire containing measures of demographic characteristics, psychological factors, and current and future water conservation behavior. The data were collected online as well as through hard copies. Correlational analysis showed that the three psychosocial factors had significant associations with both current and future intentions to conserve water. Furthermore, the effect size (f2) demonstrated that personal well-being was a significant predictor of current and future water conservation behavior. Stress, however, did not serve as a significant predictor of either current or future water conservation behavior. In contrast, subjective happiness was a significant predictor of only future water conservation behavior.
A method is proposed for compiling a corpus of phonetically-rich triphone sentences; i.e., sentences with a high variety of triphones, distributed in a uniform fashion. Such a corpus is of interest for a wide range of contexts, from automatic speech recognition to speech therapy. We evaluated this method by building phonetically-rich corpora for Brazilian Portuguese. The data employed comes from Wikipedia's dumps, which were converted into plain text, segmented and phonetically transcribed. The method consists of comparing the distance between the triphone distribution of the available sentences to an ideal uniform distribution, with equiprobable triphones. A greedy algorithm was implemented to recognize and evaluate the distance among sentences. A heuristic metric is proposed for pre-selecting sentences for the algorithm, in order to quicken its execution. The results show that, by applying the proposed metric, one can build corpora with more uniform triphone distributions.
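The greedy selection described in this abstract can be sketched under stated assumptions. The abstract does not specify the distance measure or the pre-selection heuristic, so the sketch below uses a Euclidean distance between the pooled triphone distribution and a uniform one, omits the pre-selection step, and treats each sentence as a plain string of phone symbols. The names `trigrams`, `distance_to_uniform` and `greedy_select` are hypothetical.

```python
from collections import Counter


def trigrams(symbols):
    """Overlapping three-symbol windows, standing in for triphones."""
    return [tuple(symbols[i:i + 3]) for i in range(len(symbols) - 2)]


def distance_to_uniform(counts, vocabulary):
    """Euclidean distance between a triphone distribution and the uniform one."""
    total = sum(counts.values())
    if total == 0 or not vocabulary:
        return float("inf")
    target = 1 / len(vocabulary)
    return sum((counts[t] / total - target) ** 2 for t in vocabulary) ** 0.5


def greedy_select(sentences, n):
    """Pick up to n sentences, each time adding the one that brings the
    pooled triphone distribution closest to uniform."""
    per_sentence = [Counter(trigrams(s)) for s in sentences]
    vocabulary = set().union(*per_sentence)
    chosen, pooled = [], Counter()
    remaining = set(range(len(sentences)))
    while remaining and len(chosen) < n:
        best = min(
            sorted(remaining),
            key=lambda i: distance_to_uniform(pooled + per_sentence[i], vocabulary),
        )
        chosen.append(best)
        pooled += per_sentence[best]
        remaining.remove(best)
    return chosen


# Hypothetical phone strings: the repetitive "aaaa" adds little variety.
sentences = ["abcd", "aaaa", "cdef"]
picked = greedy_select(sentences, n=2)
```

Each greedy step rescans every remaining sentence, which is the cost a pre-selection heuristic of the kind the abstract mentions would reduce on a Wikipedia-sized pool.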