We directly align the representations of point clouds and language, without the additional alignment to the image modality that classical methods require. Furthermore, we explore the great potential of point ...