Meta/xAI Big Model Debuts, WiMi Accelerates the Integration of Multimodal AI and Scenario Applications


US tech giant Meta (META) has released its latest open-source artificial intelligence (AI) model, Llama 3. Llama 3 comes in two sizes, 8 billion (8B) and 70 billion (70B) parameters, and is now accessible through mainstream cloud service platforms.

Llama 3, Billed as the Most Powerful Open-Source Large Model, Goes Live

In terms of performance, Meta said that Llama 3 shows significant improvements over the previous version: its responses to prompts are more diverse, the number of questions it wrongly refuses to answer has been reduced, and its reasoning ability is stronger.

Additionally, Meta says that both sizes of Llama 3 beat similarly sized models such as Google’s Gemma and Gemini, Mistral 7B, and Anthropic’s Claude 3 on certain benchmarks.

Meta considers Llama 3 to be the best open-source large model on the market. Along with the release of Llama 3, Meta also announced Meta AI, an AI assistant powered by the model. Initially announced during last year’s Connect conference, Meta AI has since been rolled out more broadly.

xAI’s First Multimodal Large Model Debuts

In the fiercely contested “sailing race” of AI, large models have become the lighthouse ahead. Musk’s xAI has likewise been making progress in the field: less than a month after Grok-1 was made available, xAI released its first multimodal model.

Recently, xAI reportedly unveiled Grok-1.5V, a model that not only understands text but also processes content in documents, charts, screenshots, and photographs. xAI says Grok-1.5V rivals current top multimodal models in many areas, from multidisciplinary reasoning to the understanding of documents, scientific charts, diagrams, screenshots, and photographs.

If Grok-1.5V is released under an open-source license similar to Grok-1’s, it would mark a major milestone in the current LLM competitive landscape. Over the next few months, xAI expects to significantly improve its modeling capabilities across a variety of modalities, including image, audio, and video.

Looking at the layout of the global AI market today, the overseas market, marked by GPT-4 and Sora, shows large models breaking through in multimodality and text-to-video generation. The domestic market has also put on a striking performance: Kimi, an AI application known for processing long text, has rapidly gained popularity, and major vendors are competing intensely and receiving strong market feedback.

WiMi Advances Along the Multimodal Large-Model Path

Data show that, as one of the major players at the forefront of large models, WiMi Hologram Cloud Inc. (NASDAQ: WIMI) is leveraging its technical and ecosystem advantages to rebuild its entire product line around large models. The company has applied its AI large model deeply across application areas, seeking to accelerate deployment in matching scenarios, to enhance both user experience and business efficiency, to fully explore the commercialization path of “AI large model +”, and to build out a multimodal AI+ software and hardware ecosystem.

At present, building on a series of AI ecosystem large models developed through many years of technology accumulation, WiMi has opened up the chain from model evolution to deployment through deep integration with the AI+ industry, especially in fields such as live streaming, digital humans, and the digital office. Fully empowered by these large models, WiMi’s ecosystem, grounded in self-developed AI large models, has taken shape, giving the company a good start and a certain leading edge in the wave of “AI adoption” set off by the big tech companies.

At the same time, large models have spawned new opportunities for applications to fully land, and multimodal large models have gradually produced a series of notable achievements. In this regard, WiMi will create an AI large-model scenario matrix centered on “multimodal scenarios” using its general-purpose core AI technology, providing users with smarter and more comprehensive multimodal AI+ services by combining the capabilities of multimodal large models, large language models, and text-to-image models, and continuously promoting AI-driven digital industrialization and industrial digitization.


Currently, competition in the large-model field on a global scale will continue to raise the overall capability of large models. 2024 can be called the era of an explosive emergence of new species in large-model applications, and as AI large models collide and fuse with industry, they continue to inject new vitality into AGI. In the coming era of artificial intelligence, the multimodal AI large model is the strategic high ground; building a closed commercial loop of data, models, and applications, with software and hardware advancing hand in hand, will likely become the key force driving the development of artificial intelligence.