
Team

Sukhbir Benipal

Solr Consultant + Big Data Hadoop Architect

Sukhbir Benipal is our CEO. He is the brain behind the heuristic learning search algorithm used by the benipal shopping search engine, and he envisaged the social e-commerce platform and the local and wholesale WANT Marketplaces.

Sukhbir created our B2B, B2C, O2O and C2C marketplaces, WANT and Benipal, now live on the Play Store.

He is in technology due to an epiphany, caused partly by sufficient quantities of Argentinian wine on a cold winter night in New York. Having infamously flunked his M.A. Economics exam, Sukhbir went on to run a commercial real estate finance company in Manhattan, where he helped arrange over $500M in financing for hotels and office buildings.

He is responsible for code deployment across platforms and maintains our Hadoop/HBase, search, database and API backend clusters at our U.S. datacenter.

Benipal went on to create a self-healing neural network with advanced image recognition and search for the benipal shopping search engine, and built his own 10 TFLOPS supercomputer with 12TB RAM to run it. For the shopping search engine he built a millisecond-response, multi-petabyte-scale image storage and delivery system. He also built his own 10G router in 3 hours after discovering at the datacenter that Cisco would be too expensive.

He enjoys driving in the upper Himalayan ranges and likes building CentOS servers, having built his first one from scratch in two hours after reading an online how-to. His current server build/disassemble time stands at 45 minutes. He knows nothing about writing code and has an opinion on pretty much everything.

He thinks supercomputing is fun and is proud of his failure(?) with Artificial Intelligence.

Skills: CentOS, Lucene, Solr, Hadoop HDFS, YARN, MapReduce, HBase, MySQL, Storm, Spark, Kafka, Redis, MongoDB, Node.js, Spring, Tomcat, HAProxy, Nginx, Android, iOS, Java, Parse, Firebase, geolocation, Linux, datacenter operations, networking, Elasticsearch. Solr Consultant in New York.

Marketplaces: Product Development for B2B Wholesale Marketplaces and B2C, O2O and C2C Mobile Shopping Marketplaces built around a Social Network and live Messaging with Photo and Video Sharing, Private Group Buying and Selling plus location based Local Users, Groups, Products and Deals Discovery. For India and Global markets. Status – Launched on Play Store.
● WANT - Global B2B shopping
● Benipal - India B2B marketplace
● WANT - B2C Marketplace in India
● want local - O2O shopping marketplace
● beni - C2C shopping marketplace

Logistics: Stealth Mode Product Development for a pan-India plus hyperlocal logistics service to complement “Newco” Shopping Marketplace Shipping and Delivery.
● Created algorithm-based optimum routing, a unique 10-digit-ID-based delivery location, live map delivery status, client-initiated re-routing, image-recognition-based trusted-recipient delivery acceptance, and live package plus payment confirmation.

Shopping Search Engine: 300 Million Products. 1.2B Titles and Descriptions. 1B+ Images. 12,000 Merchants.
● Voice Search and Image matching, recognition and search.
● Highly scalable infrastructure with average response times under 100ms.
● Contextual + Relational, neural network based Shopping Search Engine able to understand user queries and provide exact results for “Blue Bedspread by Martha Stewart from Walmart or Macy's.com or around me for under $500”.
● High Volume Search + Big Data Infrastructure allowing Products and Search Queries to reflect most recent state.
● Built and Managed 40 High Performance Servers in Datacenter with 10G uplinks.


Search Engine Architect / AI Researcher / Supercomputing
● Added partial NLP to search, letting the computer “understand” the query.
● Created a self-healing, self-learning “Auto Product Categorization” algorithm that can automatically analyze and categorize products into any of over 30,000 available categories. Used successfully, with a demonstrated success rate of around 85%.
● Built a 20 TFLOPS CPU-based supercomputer with over 12TB RAM, 640 cores and 1 PB of storage, with the option to add GPUs to increase total floating-point throughput.
● Worked on computer vision with a small GPU-based cluster to better understand how neural networks can understand images and “see” videos.
● Researched Artificial Intelligence using various current open-source projects, exploring how their integrated use could deepen understanding of neural networks and their application to live, real-world data: giving computers the ability to “understand” different datasets, “see” images and videos, and roughly match their interconnects.



Building a fast, feature-rich real-time search application on top of Apache Solr or Elasticsearch, both built on Apache Lucene.
Features:
● Full-text search
● Faceting
● Highlighting
● Geo-spatial search
● Replication
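As a sketch of how these features can combine in a single request, here is a hypothetical Elasticsearch-style JSON body; the field names (`title`, `brand`, `location`) are illustrative assumptions, not from any real index.

```python
import json

# Hypothetical search request combining full-text search, geo-spatial
# filtering, faceting and highlighting in Elasticsearch's JSON query DSL.
request_body = {
    "query": {
        "bool": {
            "must": {"match": {"title": "bedspread"}},       # full-text search
            "filter": {
                "geo_distance": {                            # geo-spatial search
                    "distance": "10km",
                    "location": {"lat": 40.73, "lon": -73.99},
                }
            },
        }
    },
    "aggs": {"by_brand": {"terms": {"field": "brand"}}},     # faceting
    "highlight": {"fields": {"title": {}}},                  # highlighting
}

print(json.dumps(request_body, indent=2))
```

Replication is handled at the index level (replica shards) rather than per query, which is why it does not appear in the request body.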

● Performance
Solr is fast. It became a standard among search engines for a reason: it is stable and reliable, and it outperforms nearly every search solution for basic searches, except for Elasticsearch. Yet all it takes to strain this powerful search engine is to search while concurrently updating the index with new content. Throw a few million documents into the index and Solr will struggle seriously while Elasticsearch still performs without a hitch. This becomes a serious problem if you need to update your search index regularly.
Solr simply was not designed for real-time big-data search applications. Web applications today demand that new content generated by users be indexed in real time. The distributed nature of Elasticsearch allows it to keep up with concurrent search and index requests without skipping a beat.
● Elasticsearch over Solr
● Distributed Search/Cloud-ready
Where Elasticsearch really takes the stage is distributed search. Elasticsearch, unlike Solr, was built with distribution in mind, to be EC2-friendly. What this actually means is that Elasticsearch runs a search index on multiple servers in a fail-safe and efficient way. And that’s quite a challenge: distributed systems are, in general, hard to program, but when done correctly such a system is resilient in the face of node failures and degrades gracefully.
Elasticsearch allows you to break indices into shards, each with one or more replicas. Shards are hosted on data nodes within the cluster, which delegates operations to the correct shards, with rebalancing and routing done automatically. This ensures that even in the case of a catastrophic hardware or software failure, the chances of your search server going completely offline are close to none. Elasticsearch also provides cloud support for Amazon S3, as well as GigaSpaces, Coherence and Terracotta.
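The automatic routing described above boils down to hashing a routing value (by default the document id) modulo the number of primary shards, so every node computes the same shard for a given document. A minimal sketch, assuming five primary shards; the real engine uses murmur3, so md5 here is only a standard-library stand-in:

```python
import hashlib

NUM_PRIMARY_SHARDS = 5  # fixed at index creation time in Elasticsearch


def route(doc_id: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Pick the primary shard for a document id.

    md5 stands in for the murmur3 hash Elasticsearch actually uses,
    so this sketch runs with the standard library only.
    """
    h = int(hashlib.md5(doc_id.encode()).hexdigest(), 16)
    return h % num_shards


# Any node can compute the same shard for a given id, so any node
# can forward an index or get request to the right place.
print({doc: route(doc) for doc in ["product-1", "product-2", "product-3"]})
```

Because the formula depends on the shard count, the number of primary shards cannot change after index creation without reindexing.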
Even though some steps have been taken to make Solr cloud-ready, its initial architecture and design did not include it, so it will take more time to get Solr where Elasticsearch is out of the box.
● Real-time search
Elasticsearch is real-time and distributed: you can tune the refresh delay via the API. Its design also offers percolation, an innovative search model similar to webhooks. The idea behind it is that Elasticsearch notifies your application each time a new document matches your stored filters, instead of your application constantly polling the search engine for updates. Elasticsearch has a default refresh interval of one second, so within only a second of indexing a document, it becomes searchable.
This is the perfect architecture for real-time search.
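A toy model of the percolation idea, not the real percolator API: queries are stored up front, each incoming document is tested against them, and the application is told which stored queries matched instead of polling for new results. The query name and fields are invented for illustration.

```python
# Stored queries, registered ahead of time (hypothetical example).
registered = {
    "cheap-bedspreads": lambda doc: "bedspread" in doc["title"].lower()
                                    and doc["price"] < 500,
}


def percolate(doc: dict) -> list:
    """Return the names of stored queries the new document matches."""
    return [name for name, matches in registered.items() if matches(doc)]


# On index, the application is notified of matches instead of polling.
hits = percolate({"title": "Blue Bedspread", "price": 129})
print(hits)
```

The real percolator stores the queries in an index and matches documents against them server-side; this sketch only shows the inversion of control that makes push-style notification possible.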
● JSON-based API
The Elasticsearch API is clean and easy to use, and you can build a modern application on top of it; the JSON query language provides a more powerful and useful abstraction for querying documents. Elasticsearch is more accessible and pleasant to interact with than Solr.
Less configuration to set and sensible defaults make it much more user-friendly. No schema is required, which means you can start indexing content right away. You can still use mappings to define your index structure, which Elasticsearch applies when new indices are created.
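The schemaless behaviour can be sketched as type inference from the first document seen, roughly what dynamic mapping does; the field names are invented and the type names are simplified to four cases.

```python
def infer_mapping(doc: dict) -> dict:
    """Guess a field mapping from a document's value types,
    in the spirit of Elasticsearch dynamic mapping (simplified)."""
    types = {bool: "boolean", int: "long", float: "double", str: "text"}
    return {field: {"type": types[type(value)]} for field, value in doc.items()}


# First document indexed into a new index defines the mapping.
print(infer_mapping({"title": "Blue Bedspread", "price": 129, "in_stock": True}))
```

An explicit mapping submitted up front simply overrides this inference, which is why you can mix the two approaches.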
● Solr over ElasticSearch
● Community
Solr has a mature community, and this should be a major criterion when deciding which product, Elasticsearch or Solr, to use as the base for your application. Solr has a number of very active contributors, which indicates it is a stable and trustworthy search engine. That is not to say Elasticsearch is far behind: although quite young, its community is expanding fast.
Extensive Documentation
Solr is well documented, with the necessary context and examples of how different APIs and components are used, while the Elasticsearch documentation lacks good working examples and configuration instructions, though it is slightly better organized.
Conclusion
Both are Lucene-based applications and both are open source. Solr is your search server for standard search applications where no massive indexing and no real-time updates are required. The Elasticsearch architecture is on a whole new level, aimed at building modern real-time search applications. If you want distributed indexing, you need Elasticsearch; it is the stronger option for cloud and distributed environments. Elasticsearch is scalable, lightning fast and a breeze to integrate with. Its API is more intuitive and accessible than Solr’s, and less configuration plus sensible defaults let you get a project into production very quickly.