
Amr Nour-Eldin, Vice President of Technology at LXT – Interview Series


Amr Nour-Eldin is the Vice President of Technology at LXT. Amr is a Ph.D. research scientist with over 16 years of professional experience in the fields of speech/audio processing and machine learning in the context of Automatic Speech Recognition (ASR), with a particular focus and hands-on experience in recent years on deep learning techniques for streaming end-to-end speech recognition.

LXT is an emerging leader in AI training data to power intelligent technology for global organizations. In partnership with an international network of contributors, LXT collects and annotates data across multiple modalities with the speed, scale and agility required by the enterprise. Their global expertise spans more than 145 countries and over 1000 language locales.

You pursued a PhD in Signal Processing from McGill University. What initially drew you to this field?

I always wanted to study engineering, and really liked the natural sciences in general, but was drawn more specifically to math and physics. I found myself always trying to figure out how nature works and how to apply that understanding to create technology. After high school, I had the opportunity to go into medicine and other professions, but specifically chose engineering as it represented, in my view, the perfect combination of both theory and application in the two fields closest to my heart: math and physics. And then once I had chosen it, there were many potential paths – mechanical, civil, and so on. But I specifically chose electrical engineering because it is the closest, and the hardest in my opinion, to the type of math and physics problems which I always found challenging and hence enjoyed more, as well as being the foundation of modern technology, which has always driven me.

Within electrical engineering, there are many specializations to choose from, which generally fall under two umbrellas: telecommunications and signal processing, and that of power and electrical engineering. When the time came to choose between those two, I chose telecom and signal processing because it is closer to how we describe nature through physics and equations. You're talking about signals, whether it's audio, images or video; understanding how we communicate and what our senses perceive, and how to mathematically represent that information in a way that allows us to leverage that knowledge to create and improve technology.

Could you discuss your research at McGill University on the information-theoretic aspect of artificial bandwidth extension (BWE)?

When I finished my bachelor's degree, I wanted to keep pursuing the Signal Processing field academically. After one year of studying Photonics as part of a Master's degree in Physics, I decided to switch back to Engineering to pursue my master's in Audio and Speech signal processing, focusing on speech recognition. When it came time to do my PhD, I wanted to broaden my field a little bit into general audio and speech processing, as well as the closely related fields of Machine Learning and Information Theory, rather than just focusing on the speech recognition application.

The vehicle for my PhD was the bandwidth extension of narrowband speech. Narrowband speech refers to conventional telephony speech. The frequency content of speech extends to around 20 kilohertz, but the majority of the information content is concentrated up to just 4 kilohertz. Bandwidth extension refers to artificially extending speech content from 3.4 kilohertz, which is the upper frequency bound in conventional telephony, to above that, up to eight kilohertz or more. To better reconstruct that missing higher-frequency content given only the available narrowband content, one has to first quantify the mutual information between speech content in the two frequency bands, then use that information to train a model that learns that shared information; a model that, once trained, can then be used to generate highband content given only narrowband speech and what the model learned about the relationship between that available narrowband speech and the missing highband content. Quantifying and representing that shared "mutual information" is where information theory comes in. Information theory is the study of quantifying and representing information in any signal. So my research was about incorporating information theory to improve the artificial bandwidth extension of speech. As such, my PhD was more of an interdisciplinary research activity where I combined signal processing with information theory and machine learning.
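As a rough, hypothetical illustration of that quantification step (not the actual method from the thesis), one could bin two per-frame feature streams, one derived from the narrowband and one from the highband content, and estimate their mutual information from the joint histogram. The feature names and the use of scikit-learn below are assumptions made purely for the sketch:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def estimate_mutual_information(nb_features, hb_features, n_bins=32):
    """Histogram-based mutual information estimate (in nats) between two
    scalar per-frame feature streams (hypothetical narrowband/highband
    descriptors such as band energies or spectral-envelope parameters)."""
    # Quantize each stream into n_bins levels so MI can be computed
    # from the resulting discrete joint distribution.
    nb_q = np.digitize(nb_features, np.histogram_bin_edges(nb_features, bins=n_bins))
    hb_q = np.digitize(hb_features, np.histogram_bin_edges(hb_features, bins=n_bins))
    return mutual_info_score(nb_q, hb_q)
```

A higher estimate would indicate that the narrowband features carry more usable information about the missing highband content, which is exactly what a BWE model tries to exploit.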

You were a Principal Speech Scientist at Nuance Communications, now a part of Microsoft, for over 16 years. What were some of your key takeaways from this experience?

From my perspective, the most important benefit was that I was always working on state-of-the-art, cutting-edge techniques in signal processing and machine learning and applying that technology to real-world applications. I got the chance to apply those techniques to Conversational AI products across multiple domains. These domains ranged from enterprise, to healthcare, automotive, and mobility, among others. Some of the specific applications included virtual assistants, interactive voice response, voicemail to text, and others where accurate representation and transcription is critical, such as in healthcare with doctor/patient interactions. Throughout those 16 years, I was fortunate to witness firsthand, and be part of, the evolution of conversational AI, from the days of statistical modeling using Hidden Markov Models, through the gradual takeover of Deep Learning, to now, where deep learning proliferates and dominates almost all aspects of AI, including Generative AI as well as traditional predictive or discriminative AI. Another key takeaway from that experience is the crucial role that data plays, in terms of quantity and quality, as a key driver of AI model capabilities and performance.

You've published a dozen papers, including in such acclaimed publications as IEEE. In your opinion, what is the most groundbreaking paper that you published and why was it important?

The most impactful one, by number of citations according to Google Scholar, would be a 2008 paper titled “Mel-Frequency Cepstral Coefficient-Based Bandwidth Extension of Narrowband Speech”. At a high level, the focus of this paper is on how to reconstruct speech content using a feature representation that is widely used in the field of automatic speech recognition (ASR): mel-frequency cepstral coefficients.
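As a purely illustrative aside (not code from the paper), the MFCC representation mentioned above can be computed in a few lines; the librosa library, the file name, and the 8 kHz telephony sampling rate are assumptions for the example:

```python
import librosa

# Load a hypothetical narrowband recording at the 8 kHz rate typical of telephony.
y, sr = librosa.load("narrowband_speech.wav", sr=8000)

# Compute 13 mel-frequency cepstral coefficients per analysis frame.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfcc.shape)  # (13, number_of_frames)
```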

However, the more innovative paper, in my opinion, is the paper with the second-most citations, a 2011 paper titled “Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband Speech”. In that work, I proposed a new statistical modeling technique that incorporates temporal information in speech. The advantage of that technique is that it enables modeling long-term information in speech with minimal additional complexity, and in a fashion that still allows the generation of wideband speech in a streaming or real-time fashion.
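For readers unfamiliar with the baseline being extended: a memoryless GMM framework for BWE typically fits a joint Gaussian mixture over stacked narrowband and highband feature vectors, then estimates the highband part of a new frame as the conditional (MMSE) mean given its narrowband part. The sketch below shows only that generic baseline mapping, not the paper's memory-based extension; the matrices `X_nb` and `Y_hb` and the use of scikit-learn/SciPy are hypothetical:

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X_nb, Y_hb, n_components=8):
    """Fit a GMM on stacked [narrowband | highband] per-frame feature vectors."""
    Z = np.hstack([X_nb, Y_hb])
    return GaussianMixture(n_components=n_components, covariance_type="full").fit(Z)

def estimate_highband(gmm, x_nb, dx):
    """MMSE estimate of the highband features given one narrowband vector x_nb,
    where dx is the narrowband feature dimension."""
    resp = np.zeros(gmm.n_components)
    cond_means = []
    for k in range(gmm.n_components):
        mu_x, mu_y = gmm.means_[k, :dx], gmm.means_[k, dx:]
        S = gmm.covariances_[k]
        S_xx, S_yx = S[:dx, :dx], S[dx:, :dx]
        # Responsibility of component k for the observed narrowband vector.
        resp[k] = gmm.weights_[k] * multivariate_normal.pdf(x_nb, mean=mu_x, cov=S_xx)
        # Conditional mean of the highband part under component k.
        cond_means.append(mu_y + S_yx @ np.linalg.solve(S_xx, x_nb - mu_x))
    resp /= resp.sum()
    return sum(w * m for w, m in zip(resp, cond_means))
```

The memory-based approach described in the paper goes beyond this frame-by-frame mapping by bringing longer-term temporal context into the model while keeping the added complexity small enough for streaming use.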

In June 2023 you were recruited as Vice President of Technology at LXT. What attracted you to this position?

Throughout my academic and professional experience prior to LXT, I have always worked directly with data. In fact, as I noted earlier, one key takeaway for me from my work in speech science and machine learning was the crucial role data played in the AI model life cycle. Having enough quality data in the right format was, and continues to be, vital to the success of state-of-the-art deep-learning-based AI. As such, when I happened to be at a stage of my career where I was seeking a startup-like environment where I could learn, broaden my skills, and leverage my speech and AI experience to have the most impact, I was fortunate to have the opportunity to join LXT. It was the perfect fit. Not only is LXT an AI data provider that is growing at an impressive and consistent pace, but I also saw it as being at the perfect stage in terms of growth in AI technology as well as in client size and diversity, and hence in AI and AI data types. I relished the opportunity to join and help in its growth journey; to have a big impact by bringing the perspective of a data end user after having been an AI data scientist user for all those years.

What does your average day at LXT look like?

My average day starts with looking into the latest research on one topic or another, which has lately centered around generative AI, and how we can apply that to our customers' needs. Thankfully, I have an excellent team that is very adept at creating and tailoring solutions to our clients' often-specialized AI data needs. So, I work closely with them to set that agenda.

There is also, of course, strategic annual and quarterly planning, breaking down strategic objectives into individual team goals, and keeping up to speed with developments along those plans. As for the feature development we're doing, we generally have two technology tracks. One is to make sure we have the right pieces in place to deliver the best outcomes on our current and new incoming projects. The other track is improving and expanding our technology capabilities, with a focus on incorporating machine learning into them.

Could you discuss the types of machine learning algorithms that you work on at LXT?

Artificial intelligence solutions are transforming businesses across all industries, and we at LXT are honored to provide the high-quality data to train the machine learning algorithms that power them. Our customers are working on a wide range of applications, including augmented and virtual reality, computer vision, conversational AI, generative AI, search relevance, and speech and natural language processing (NLP), among others. We are dedicated to powering the machine learning algorithms and technologies of the future through data generation and enhancement across every language, culture and modality.

Internally, we are also incorporating machine learning to improve and optimize our internal processes, ranging from automating our data quality validation to enabling a human-in-the-loop labeling model across all data modalities we work on.

Speech and audio processing is rapidly approaching near perfection when it comes to English, and particularly white males. How long do you anticipate it will be until it is an even playing field across all languages, genders, and ethnicities?

This is a difficult question, and it depends on a number of factors, including economic, political, social and technological ones, among others. But what is clear is that the prevalence of the English language is what drove AI to where we are now. So getting to a place where it is a level playing field really depends on the speed at which the representation of data from different ethnicities and populations grows online, and the pace at which it grows is what will determine when we get there.

However, LXT and similar companies can have a big hand in driving us toward a more level playing field. As long as the data for less well-represented languages, genders and ethnicities is hard to access or simply not available, that change will come more slowly. But we are trying to do our part. With coverage for over 1,000 language locales and experience in 145 countries, LXT helps to make access to more language data possible.

What is your vision for how LXT can accelerate AI efforts for different clients?

Our goal at LXT is to provide the data solutions that enable efficient, accurate, and faster AI development. Through our 12 years of experience in the AI data space, not only have we accumulated extensive know-how about clients' needs in terms of all aspects relating to data, but we have also continuously fine-tuned our processes in order to deliver the highest quality data at the fastest pace and best price points. Consequently, thanks to our steadfast commitment to providing our clients the optimal combination of AI data quality, efficiency, and pricing, we have become a trusted AI data partner, as evidenced by our repeat clients who keep coming back to LXT for their ever-growing and evolving AI data needs. My vision is to cement, improve and expand that LXT “MO” to all the modalities of data we work on, as well as to all types of AI development we now serve, including generative AI. Achieving this goal revolves around strategically expanding our own machine learning and data science capabilities, both in terms of technology and resources.

Thank you for the great interview; readers who wish to learn more should visit LXT.



